Agentic RAG Supabase

Handle RAG operations with Supabase pgvector for PDF/TXT files

Actions (9)

File Actions
Vector Actions
- Upsert Vector
- Search Vector
Agentic RAG Actions

Overview

The Full Pipeline operation of the Agentic RAG node implements a Retrieval-Augmented Generation (RAG) workflow using the pgvector extension on Supabase, OpenAI's GPT-3.5-turbo model, and Hugging Face embeddings. It ingests documents, embeds their content into a vector database, and then answers user queries by retrieving relevant document chunks and iteratively refining the generated answer.

This node is useful when you want to build an AI assistant or knowledge base that answers questions based on custom documents such as PDF, TXT, or DOCX files. For example, it can be used to:

Ingest company manuals or product documentation and answer employee queries.
Build a customer support bot that references internal knowledge bases.
Create research assistants that summarize and answer questions from scientific papers.

The Full Pipeline operation automates the entire flow: ingesting a document, embedding its content, storing vectors, and then processing a query against this data with iterative refinement for improved answers.
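
The iterative refinement described above can be pictured as a retrieve, generate, evaluate, refine loop. The following is a minimal sketch under stated assumptions, not the node's implementation: the `retrieve`, `generate`, `evaluate`, and `refine` callbacks are hypothetical stand-ins for the Supabase search and OpenAI calls, the `goodEnough` early-exit score is invented for illustration, and the callbacks are synchronous only to keep the example short (real API calls would be asynchronous).

```typescript
// Hypothetical sketch of an iterative retrieve-generate-evaluate loop.
// None of these names come from the node's source; they are placeholders.
interface RagSteps {
  retrieve: (query: string) => string[]; // stands in for the vector search
  generate: (query: string, chunks: string[]) => string; // stands in for answer generation
  evaluate: (query: string, answer: string) => number; // stands in for answer scoring
  refine: (query: string, answer: string) => string; // stands in for query refinement
}

interface LoopResult {
  finalAnswer: string;
  bestScore: number;
  iterations: number;
}

function runRefinementLoop(
  query: string,
  steps: RagSteps,
  maxIterations = 3, // documented default for Max Iterations
  goodEnough = 0.9, // assumed early-exit score, not documented by the node
): LoopResult {
  let currentQuery = query;
  let finalAnswer = "";
  let bestScore = 0;
  let iterations = 0;

  for (let i = 0; i < maxIterations; i++) {
    iterations = i + 1;
    const chunks = steps.retrieve(currentQuery);
    if (chunks.length === 0) {
      // Nothing retrieved: reword the query and try again on the next pass.
      currentQuery = steps.refine(currentQuery, finalAnswer);
      continue;
    }
    const answer = steps.generate(currentQuery, chunks);
    // Score against the original query so refinements cannot drift off topic.
    const score = steps.evaluate(query, answer);
    if (finalAnswer === "" || score > bestScore) {
      bestScore = score;
      finalAnswer = answer;
    }
    if (bestScore >= goodEnough) break;
    currentQuery = steps.refine(currentQuery, answer);
  }

  return { finalAnswer, bestScore, iterations };
}
```

Keeping the best-scoring answer rather than the last one means a poor final iteration cannot overwrite an earlier, better answer, which matches the `finalAnswer` and `bestScore` fields described in the Output section.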

Properties

Name	Meaning
Query	The question or search query string to process against the ingested documents.
Top K	Number of top matching document chunks to retrieve during vector similarity search (default 5).
Document Path	File path to the document to ingest (PDF, TXT, DOCX supported).
Max Iterations	Maximum number of iterations for query refinement and answer generation (default 3).
OpenAI API Key	Required API key for OpenAI to generate answers and evaluate responses.
Similarity Threshold	Cosine similarity threshold for filtering retrieved document chunks (default 0.78).
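
To make the Top K and Similarity Threshold properties concrete, the sketch below shows how the two settings interact: candidates are scored by cosine similarity, anything below the threshold is dropped, and at most K of the best matches are kept. This only illustrates the semantics. The node delegates the search to Supabase, and `Chunk`, `cosineSimilarity`, and `selectChunks` are illustrative names, not part of the node's API.

```typescript
// Illustrative only: shows what "Top K" and "Similarity Threshold" mean.
interface Chunk {
  content: string;
  embedding: number[];
}

interface ScoredChunk {
  content: string;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function selectChunks(
  queryEmbedding: number[],
  chunks: Chunk[],
  topK = 5, // documented default
  similarityThreshold = 0.78, // documented default
): ScoredChunk[] {
  return chunks
    .map((c) => ({
      content: c.content,
      similarity: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .filter((c) => c.similarity >= similarityThreshold) // drop weak matches
    .sort((x, y) => y.similarity - x.similarity) // best match first
    .slice(0, topK); // keep at most K results
}
```

Note that the threshold is applied before the Top K cut, so a high threshold can return fewer than K chunks, or none at all. Lowering the threshold is one thing to try when a query retrieves no documents.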

Output

The output JSON object from the Full Pipeline operation includes:

query: The original user query string.
documentPath: The path of the ingested document or "none" if no document was ingested.
finalAnswer: The best generated answer after iterative retrieval and refinement.
bestScore: The highest evaluation score achieved by any iteration's answer.
pipeline: An array detailing each step of the pipeline:
- step: The pipeline stage name ("ingestion" or "query_processing").
- result: The detailed result object from that step, including ingestion stats or query iteration details.
pipelineComplete: Boolean indicating successful completion of the full pipeline.

The node does not output binary data.
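
For downstream nodes or scripts that consume this output, the fields listed above can be written down as a TypeScript type. This is a sketch inferred from the field list, not a type exported by the node. It assumes `bestScore` is numeric, and it leaves `result` as `unknown` because its structure is not specified here. `isFullPipelineOutput` is a hypothetical helper.

```typescript
// Shape inferred from the documented output fields; not an official type.
type PipelineStepName = "ingestion" | "query_processing";

interface PipelineStep {
  step: PipelineStepName;
  result: unknown; // ingestion stats or query iteration details (structure not documented)
}

interface FullPipelineOutput {
  query: string;
  documentPath: string; // "none" when no document was ingested
  finalAnswer: string;
  bestScore: number; // assumed numeric
  pipeline: PipelineStep[];
  pipelineComplete: boolean;
}

// Runtime check to run before trusting the output downstream.
function isFullPipelineOutput(value: unknown): value is FullPipelineOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const pipeline = v["pipeline"];
  return (
    typeof v["query"] === "string" &&
    typeof v["documentPath"] === "string" &&
    typeof v["finalAnswer"] === "string" &&
    typeof v["bestScore"] === "number" &&
    typeof v["pipelineComplete"] === "boolean" &&
    Array.isArray(pipeline) &&
    pipeline.every(
      (s: unknown) =>
        typeof s === "object" &&
        s !== null &&
        ((s as PipelineStep).step === "ingestion" ||
          (s as PipelineStep).step === "query_processing"),
    )
  );
}
```

A guard like this lets a consumer check `pipelineComplete` and read `finalAnswer` without casting, and fails fast if a future version of the node changes the output shape.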

Dependencies

Supabase: Used as the vector database backend with pgvector extension for storing and querying document embeddings.
OpenAI API: Requires an API key to access GPT-3.5-turbo for answer generation, evaluation, and query refinement.
Hugging Face Inference API: Used to generate vector embeddings for document chunks via the "thenlper/gte-small" model.
File System Access: Reads local files (PDF, TXT, DOCX) for ingestion.
Node.js Modules: Uses libraries like pdf-parse, mammoth (for DOCX), csv-parser, and others for file parsing and structured extraction.

n8n Configuration: The node requires credentials configured in n8n for Supabase access, along with the OpenAI API key, so that both are stored securely.

Troubleshooting

Common Issues

Unsupported File Type: If the document path points to a file type other than PDF, TXT, or DOCX, the node will throw an error.
API Key Errors: Missing or invalid OpenAI API keys will cause failures in answer generation or evaluation steps.
Embedding Failures: Network issues or invalid Hugging Face API keys can cause embedding requests to fail.
Supabase RPC Errors: Problems with the vector database setup or SQL execution may cause errors during vector upsert or search operations.
Empty Search Results: If no relevant documents are found for a query, the node returns an iteration result indicating retrieval failure.
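
The Unsupported File Type issue listed above can be caught early by validating the document path before it reaches the node. The sketch below assumes the three documented formats are the only ones accepted and that matching is case-insensitive, which this document does not confirm. `getExtension` and `checkDocumentPath` are hypothetical helpers, and the error text simply mirrors the wording of the node's documented message.

```typescript
// Formats named in this document; the node's real accept-list may differ.
const SUPPORTED_EXTENSIONS = [".pdf", ".txt", ".docx"];

// Returns the lowercase extension including the dot, or "" if there is none.
function getExtension(filePath: string): string {
  const parts = filePath.split(/[\\/]/); // handle both POSIX and Windows separators
  const fileName = parts[parts.length - 1];
  const dot = fileName.lastIndexOf(".");
  // dot > 0 so that dotfiles such as ".env" count as having no extension
  return dot > 0 ? fileName.slice(dot).toLowerCase() : "";
}

// Returns the extension if supported; throws otherwise.
function checkDocumentPath(filePath: string): string {
  const ext = getExtension(filePath);
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    throw new Error(`Unsupported file type: ${ext || "(none)"}`);
  }
  return ext;
}
```

Running a check like this in an upstream step gives a clear failure before any embedding or API calls are made.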

Error Messages and Resolutions

"Unsupported file type: .xyz": Use supported file formats (PDF, TXT, DOCX).
"Answer generation error: ...": Check OpenAI API key validity and network connectivity.
"Embedding error: ...": Verify Hugging Face API key and service availability.
"Unknown agentic RAG operation: ...": Ensure the correct operation name is selected.
"No documents found" in iteration results: Confirm the document was ingested properly and contains relevant content.

Links and References

This summary covers the static analysis of the Agentic RAG node’s Full Pipeline operation, describing its inputs, outputs, dependencies, and potential troubleshooting tips.
