ollama-reranker

Ollama Reranker for n8n - VL Classifier integration + Auto-detect + Dynamic model loading (Vector Store provider + workflow node)


n8n-nodes-ollama-reranker

License: MIT

Advanced Reranker Provider for n8n - Supporting Ollama-compatible APIs, Custom Rerank servers, and VL Classifiers.

โš ๏ธ Important Note: Sorry folks, Ollama doesn't natively support reranker models! We're developing our own solution to bring powerful reranking capabilities to n8n. This package works with Ollama-compatible APIs that implement reranking through prompt-based scoring, custom rerank endpoints, and now Vision-Language classification servers.

Features

  • 🎯 Integrates seamlessly with n8n Vector Store nodes
  • 🚀 Multiple API types: Ollama Generate, Custom Rerank, VL Classifier
  • 🤖 Auto-detection of server capabilities
  • 🔧 Multiple models supported (BGE Reranker, Qwen3 family)
  • 🎨 VL Classification for document complexity analysis (v1.4.0+)
  • ⚡ Concurrent processing with configurable batch sizes
  • 🔄 Automatic retries with exponential backoff
  • 📊 Flexible scoring with threshold and topK parameters
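As a rough illustration of the retry behavior, a generic exponential-backoff wrapper might look like this (the function name and defaults are hypothetical, not the package's actual API):

```typescript
// Hypothetical sketch: retry a failing async call, doubling the delay
// between attempts (500ms, 1000ms, 2000ms, ...).
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      const delay = baseDelayMs * 2 ** attempt; // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```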

Installation

Via npm (Recommended)

npm install n8n-nodes-ollama-reranker

Via n8n Community Nodes UI

  1. Go to Settings → Community Nodes
  2. Select Install
  3. Enter n8n-nodes-ollama-reranker
  4. Click Install

Via Docker

Add to your n8n Dockerfile:

FROM n8nio/n8n:latest
USER root
RUN cd /usr/local/lib/node_modules/n8n && \
    npm install n8n-nodes-ollama-reranker
USER node

Prerequisites

Choose your server type:

Option 1: Ollama (Prompt-based reranking)

  1. Ollama must be running and accessible
  2. Pull a reranker model:
# Recommended - BGE Reranker v2-M3
ollama pull bge-reranker-v2-m3

# Or Qwen3 models
ollama pull dengcao/Qwen3-Reranker-4B:Q5_K_M

Option 2: Custom Rerank API

Use any service that implements a /api/rerank endpoint (such as deposium-embeddings-turbov2)
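A custom rerank call typically posts the query and candidate documents, and receives index/score pairs back. A sketch with an injectable HTTP client (the request/response fields shown are an assumed common shape; your server's schema may differ):

```typescript
// Minimal rerank client sketch. `FetchLike` is injected so the logic can be
// exercised without a live server; field names are illustrative assumptions.
type FetchLike = (url: string, init?: object) => Promise<{
  ok: boolean;
  status: number;
  json: () => Promise<any>;
}>;

interface RerankResult {
  index: number; // position of the document in the input array
  relevance_score: number; // higher means more relevant
}

async function callRerank(
  baseUrl: string,
  query: string,
  documents: string[],
  fetchFn: FetchLike,
): Promise<RerankResult[]> {
  const response = await fetchFn(`${baseUrl}/api/rerank`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, documents }),
  });
  if (!response.ok) throw new Error(`Rerank request failed: ${response.status}`);
  const data = await response.json();
  return data.results as RerankResult[];
}
```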

Option 3: VL Classifier Server (NEW in v1.4.0)

Deploy a Vision-Language classifier server with:

  • /api/status - Server health and capabilities
  • /api/classify - Document complexity classification
  • Optional /api/rerank - Direct reranking support

Example: deposium_embeddings-turbov2 with ResNet18 ONNX INT8 model

Usage

Basic Setup

  1. Add an Ollama Reranker node to your workflow
  2. Connect it to a Vector Store node (e.g., Pinecone, Qdrant, Supabase)
  3. Configure:
    • API Type: Choose between:
      • Ollama Generate API - Standard Ollama prompt-based
      • Custom Rerank API - Direct reranking endpoint
      • VL Classifier + Reranker - Vision-Language classification
      • Auto-Detect - Automatically detect server type
    • Model: Select a reranker model
    • Top K: Number of documents to return
    • Threshold: Minimum relevance score (0-1)
    • Base URL: URL to your server
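To make the Top K and Threshold parameters concrete, here is a minimal sketch of how they combine (an illustrative helper, not the node's internal code): documents below the threshold are dropped, the rest are sorted by score, and the top K survive.

```typescript
// Illustrative sketch of Threshold + Top K filtering.
interface ScoredDocument {
  text: string;
  score: number; // relevance score in [0, 1]
}

function applyThresholdAndTopK(
  docs: ScoredDocument[],
  threshold: number,
  topK: number,
): ScoredDocument[] {
  return docs
    .filter((d) => d.score >= threshold) // drop low-relevance documents
    .sort((a, b) => b.score - a.score)   // best first
    .slice(0, topK);                     // keep at most topK
}
```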

VL Classifier Options (v1.4.0+)

When using VL Classifier:

  • Enable VL Classification: Use complexity analysis
  • Classification Strategy:
    • Metadata - Add complexity as document metadata
    • Filter - Filter by complexity before reranking
    • Both - Combine filtering and metadata
  • Filter Complexity: Keep LOW, HIGH, or both complexity documents

Example Workflow

User Query → Vector Store (retrieve 50 docs)
           → Ollama Reranker (rerank to top 10)
           → Continue with top-ranked documents

Supported Configurations

Reranker Models (Ollama/Custom API)

| Model | Size | Speed | Accuracy | Best For |
|-------|------|-------|----------|----------|
| bge-reranker-v2-m3 | ~600MB | ⚡⚡⚡ | ⭐⭐⭐⭐ | General purpose (Recommended) |
| Qwen3-Reranker-0.6B | ~400MB | ⚡⚡⚡⚡ | ⭐⭐⭐ | Low resource environments |
| Qwen3-Reranker-4B | ~2.5GB | ⚡⚡ | ⭐⭐⭐⭐ | Balanced performance |
| Qwen3-Reranker-8B | ~5GB | ⚡ | ⭐⭐⭐⭐⭐ | Maximum accuracy |

VL Classifier Models

| Model | Size | Speed | Use Case |
|-------|------|-------|----------|
| ResNet18-ONNX-INT8 | 11MB | ⚡⚡⚡⚡ | Document complexity classification |
| Custom VL models | Varies | Varies | Vision-Language tasks |

Development

Setup

cd custom-nodes
npm install

Build

npm run build

Lint & Format

npm run lint        # Check linting
npm run lint:fix    # Auto-fix issues
npm run format      # Format code

Test

npm test

Local Testing

Use the included docker-compose.yml:

docker-compose up -d

Access n8n at http://localhost:5678 (admin/admin)

Git Hooks

Pre-commit hooks are configured with Husky to:

  • Run lint-staged (ESLint + Prettier)
  • Run qlty quality checks

Publishing

Publishing to npm is automated via GitHub Actions:

  1. Update version in custom-nodes/package.json
  2. Commit changes
  3. Create and push a tag:
git tag v1.0.1
git push origin v1.0.1

GitHub Actions will automatically:

  • Build the package
  • Run tests
  • Publish to npm

Project Structure

n8n_reranker/
├── custom-nodes/              # Main npm package
│   ├── src/
│   │   └── nodes/
│   │       └── OllamaReranker/
│   │           ├── OllamaReranker.node.ts
│   │           ├── OllamaReranker.node.test.ts
│   │           └── ollama.svg
│   ├── .eslintrc.js
│   ├── .husky/                # Git hooks
│   ├── .prettierrc.json
│   ├── package.json           # Main package.json
│   ├── tsconfig.json
│   └── jest.config.js
├── Dockerfile                 # For local development
├── docker-compose.yml         # Complete dev environment
└── .github/
    └── workflows/
        └── npm-publish.yml    # Automated publishing

How It Works

The Reranking Challenge

Ollama doesn't natively support reranker models that output relevance scores. Instead, we implement three approaches:

  1. Prompt-based Scoring: Use Ollama's /api/generate with specially formatted prompts
  2. Custom Rerank API: Connect to servers with dedicated /api/rerank endpoints
  3. VL Classification: Pre-process with Vision-Language models for intelligent filtering
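Approach 1 can be sketched as follows: build a scoring prompt for /api/generate, then parse a numeric score out of the model's completion. The exact prompt wording and parsing below are illustrative assumptions, not the package's actual template.

```typescript
// Sketch of prompt-based relevance scoring (hypothetical prompt template).
function buildScoringPrompt(query: string, document: string): string {
  return [
    'Rate the relevance of the document to the query on a scale from 0 to 1.',
    `Query: ${query}`,
    `Document: ${document}`,
    'Answer with only the number.',
  ].join('\n');
}

// Extract the first number from the completion; clamp to [0, 1] in case the
// model answers outside the requested scale.
function parseScore(completion: string): number {
  const match = completion.match(/\d+(\.\d+)?/);
  if (!match) return 0;
  return Math.min(1, Math.max(0, parseFloat(match[0])));
}
```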

API Type Detection

The node automatically detects your server type by checking:

  1. /api/status → VL Classifier server
  2. /api/tags → Ollama server
  3. /api/rerank → Custom rerank server
  4. Fallback → Ollama (default)
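The detection order above can be sketched as a series of probes with an injected HTTP client (the real node inspects response bodies; this simplified version only checks reachability):

```typescript
// Simplified server-type detection following the probe order listed above.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

type ServerType = 'vl-classifier' | 'ollama' | 'custom-rerank';

async function detectServerType(
  baseUrl: string,
  fetchFn: FetchLike,
): Promise<ServerType> {
  const probes: Array<[string, ServerType]> = [
    ['/api/status', 'vl-classifier'],
    ['/api/tags', 'ollama'],
    ['/api/rerank', 'custom-rerank'],
  ];
  for (const [path, type] of probes) {
    try {
      const res = await fetchFn(`${baseUrl}${path}`);
      if (res.ok) return type;
    } catch {
      // Endpoint unreachable; try the next probe
    }
  }
  return 'ollama'; // fallback, matching step 4 above
}
```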

Architecture

This node implements two n8n patterns:

Provider Node (OllamaReranker)

  1. No inputs - Provider nodes don't receive workflow data
  2. AiReranker output - Connects to Vector Store nodes
  3. supplyData() - Returns a reranker provider object
  4. Standard interfaces:
    • rerank() - Main reranking method
    • compressDocuments() - LangChain compatibility
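A minimal sketch of the provider pattern, with a stand-in scoring function in place of the real server call (everything here is illustrative except rerank() and compressDocuments(), which come from the list above):

```typescript
// Sketch: supplyData() returns an object whose rerank() scores and reorders
// documents, and whose compressDocuments() adapts the same logic to the
// LangChain contextual-compression interface. The scoring function stands in
// for the configured server.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

function supplyData(scoreFn: (query: string, doc: Doc) => number) {
  return {
    response: {
      async rerank(query: string, docs: Doc[], topK: number): Promise<Doc[]> {
        return docs
          .map((doc) => ({ doc, score: scoreFn(query, doc) }))
          .sort((a, b) => b.score - a.score)
          .slice(0, topK)
          .map(({ doc }) => doc);
      },
      async compressDocuments(docs: Doc[], query: string): Promise<Doc[]> {
        // LangChain compatibility: same reranking, keeping all documents
        return this.rerank(query, docs, docs.length);
      },
    },
  };
}
```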

Workflow Node (OllamaRerankerWorkflow)

  1. Main inputs/outputs - Processes workflow items
  2. execute() - Transforms documents in the workflow
  3. usableAsTool - Can be used as AI Agent tool

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests and linting
  5. Submit a pull request

Security

  • ✅ No vulnerabilities in dependencies
  • ✅ form-data updated to 4.0.4 (fixes CVE-2025-7783)
  • ✅ Code quality validated with qlty

License

MIT © Gabriel BRUMENT


Support

For issues and feature requests, please use the GitHub Issues page.
