Package Information
Downloads: 54 weekly / 89 monthly
Latest Version: 0.3.9
Author: Bee2Bee Team
n8n-nodes-bee2bee-indexer
This is an n8n community node that lets you index GitHub repositories and generate embeddings for RAG (Retrieval-Augmented Generation) systems.
Features
- 🐝 Multi-language support: Python, JavaScript, TypeScript, Rust, Go, Java, C, C++
- 🔍 Smart code parsing: Uses tree-sitter for accurate AST-based parsing
- 🧠 Dual embeddings: Generates both NLP and code-specific embeddings
- ⚡ Flexible output: Choose between full data, chunks only, or metadata only
- 🔐 Multiple providers: Local embeddings (free) or OpenAI (paid)
- 🎯 Customizable chunking: Function-level, class-level, or file-level strategies
Installation
Follow the installation guide in the n8n community nodes documentation.
Community Node Installation
- Go to Settings > Community Nodes.
- Select Install.
- Enter `n8n-nodes-bee2bee-indexer` in **Enter npm package name**.
- Agree to the risks of using community nodes.
- Select Install.
Manual Installation
To get started locally, install the dependencies:

```shell
cd n8n-node
npm install
```

Build the node:

```shell
npm run build
```

Link it to your local n8n installation:

```shell
npm link
```

Then in your n8n custom directory (`~/.n8n/custom/`):

```shell
npm link n8n-nodes-bee2bee-indexer
```
Credentials
This node requires the following credentials:
- GitHub Token: Personal Access Token for downloading repositories
- OpenAI API Key (optional): Only needed if using OpenAI embeddings
- Embedding Provider: Choose between `local` (free) or `openai` (paid)
Operations
Index Repository
Downloads a GitHub repository and generates embeddings for all code files.
Parameters:
- Repository Owner (required): GitHub username or organization
- Repository Name (required): Repository name
- Branch (required): Git branch to index (default: `main`)
- Output Format:
  - `Full`: Metadata + Chunks + Embeddings
  - `Chunks + Embeddings`: Only code chunks with embeddings
  - `Chunks Only`: Code chunks without embeddings
  - `Metadata Only`: Repository statistics only
Additional Options:
- Max Files: Limit number of files to process (0 = no limit)
- File Extensions: Comma-separated list of extensions to include
- Exclude Patterns: Directories to exclude (e.g., `node_modules`, `dist`)
- Include Docstrings: Extract and include documentation
- Chunk Strategy: `function`, `class`, or `file` level chunking
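As an illustration, a configuration for indexing a repository with function-level chunking might look like the fragment below. The field names mirror the parameter labels above; the exact keys used internally by the node may differ:

```json
{
  "repositoryOwner": "facebook",
  "repositoryName": "react",
  "branch": "main",
  "outputFormat": "full",
  "options": {
    "maxFiles": 0,
    "fileExtensions": ".js,.ts",
    "excludePatterns": "node_modules,dist",
    "includeDocstrings": true,
    "chunkStrategy": "function"
  }
}
```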
Output
The node outputs a JSON object with the following structure:
```json
{
  "success": true,
  "repository": {
    "owner": "facebook",
    "name": "react",
    "branch": "main",
    "fullName": "facebook/react"
  },
  "statistics": {
    "totalFiles": 150,
    "processedFiles": 145,
    "totalChunks": 1234,
    "languageBreakdown": {
      "javascript": 80,
      "typescript": 65
    }
  },
  "chunks": [
    {
      "id": "unique_id",
      "code": "function example() {...}",
      "metadata": {
        "file_path": "src/index.js",
        "language": "javascript",
        "chunk_type": "function",
        "name": "example",
        "lines": [10, 25]
      },
      "embeddings": {
        "nlp": [0.1, 0.2, ...],
        "code": [0.3, 0.4, ...]
      }
    }
  ]
}
```
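For downstream vector stores, each chunk in this output can be flattened into an id/vector/metadata record. A minimal sketch, assuming a generic record shape; the `toUpsertRecords` helper and the choice of the `nlp` embedding are illustrative, not part of the node:

```javascript
// Flatten Bee2Bee Indexer output chunks into generic vector-DB records.
// Helper name and record shape are illustrative; adapt to your store's API.
function toUpsertRecords(output, embeddingKind = "nlp") {
  return output.chunks.map((chunk) => ({
    id: chunk.id,
    values: chunk.embeddings[embeddingKind],
    metadata: {
      file_path: chunk.metadata.file_path,
      language: chunk.metadata.language,
      name: chunk.metadata.name,
    },
  }));
}

// Sample shaped like the node's output above:
const sample = {
  chunks: [
    {
      id: "chunk_1",
      code: "function example() {}",
      metadata: {
        file_path: "src/index.js",
        language: "javascript",
        chunk_type: "function",
        name: "example",
        lines: [10, 25],
      },
      embeddings: { nlp: [0.1, 0.2], code: [0.3, 0.4] },
    },
  ],
};

console.log(toUpsertRecords(sample));
```

Keeping the metadata alongside each vector lets a later search step surface the file path and symbol name of every match.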
Usage in n8n Workflows
Example: Index → Store in Vector DB
[Schedule Trigger] → [Bee2Bee Indexer] → [Pinecone] → [Webhook]
- Bee2Bee Indexer node processes the repository
- Output is sent to Pinecone (or ChromaDB/Qdrant/Weaviate)
- Final webhook confirms indexing is complete
Example: Search Flow
[Webhook] → [Pinecone Search] → [OpenAI] → [Response]
- User sends search query via webhook
- Pinecone searches indexed embeddings
- OpenAI uses retrieved chunks for context
- Response sent back with answer
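The retrieval step above can be sketched as a cosine-similarity ranking over stored chunk embeddings. This is a minimal in-memory stand-in for the Pinecone search, assuming the chunk shape from the Output section; the `cosine` and `topK` names are illustrative:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank indexed chunks against a query embedding; in a real workflow
// the vector DB performs this search server-side.
function topK(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((c) => ({ ...c, score: cosine(queryEmbedding, c.embeddings.nlp) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const indexed = [
  { id: "a", code: "function add(x, y) { return x + y; }", embeddings: { nlp: [1, 0] } },
  { id: "b", code: "function sub(x, y) { return x - y; }", embeddings: { nlp: [0, 1] } },
];

console.log(topK([0.9, 0.1], indexed, 1)[0].id); // "a"
```

The top-ranked chunks (code plus file path and symbol name) are what gets passed to OpenAI as context for the final answer.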
Compatibility
Tested with n8n version 1.0.0+