cascadeflow

n8n node for cascadeflow - Smart AI model cascading with 40-85% cost savings

Package Information

Downloads: 298 weekly / 1,405 monthly

Latest Version: 1.1.0

Author: Lemony Inc.

Available Nodes

CascadeFlow

Smart AI model cascading with 40-85% cost savings. Supports 16 domains with domain-specific model routing.

CascadeFlow Agent

CascadeFlow AI Agent with drafter/verifier orchestration, tool routing, domain routing, and trace metadata.

Documentation

@cascadeflow/n8n-nodes-cascadeflow

n8n community node for cascadeflow

Intelligent AI model cascading for n8n workflows with domain understanding.

This package provides two nodes for n8n workflows:

Node	Type	Use case
CascadeFlow (Model)	Language Model sub-node	Drop-in replacement for any AI Chat Model. Wire into Basic LLM Chain, Chain, or any node that accepts a Language Model.
CascadeFlow Agent	Standalone agent node	Full agent with tool calling, memory, and multi-step reasoning. Wire directly into workflows like Chat Trigger → Agent → response.

Both nodes share the same cascade engine: try a cheap drafter first, validate quality, and escalate to a verifier only when needed. The result is 40-85% cost savings.

n8n is a fair-code licensed workflow automation platform.

Installation

Follow the installation guide in the n8n community nodes documentation.

Community Nodes (Recommended)

Go to Settings > Community Nodes
Select Install
Enter @cascadeflow/n8n-nodes-cascadeflow in the "Enter npm package name" field
Agree to the risks of using community nodes, then install

Manual installation

npm install @cascadeflow/n8n-nodes-cascadeflow

For Docker-based deployments, add the following line before the font installation command in your n8n Dockerfile:

RUN cd /usr/local/lib/node_modules/n8n && npm install @cascadeflow/n8n-nodes-cascadeflow

Node 1: CascadeFlow (Model)

A Language Model sub-node (ai_languageModel output) that acts as a drop-in cascading wrapper around two models.

When to use

You want to plug cascadeflow into an existing chain or LLM node
You do not need tool calling or memory
Works with: Basic LLM Chain, Chain, Question and Answer Chain, Summarization Chain, and any node that accepts a Language Model input

Architecture

┌─────────────┐
│  Drafter    │ (e.g., Claude Haiku, GPT-4o-mini)
└──────┬──────┘
       │
       ├──────► ┌──────────────┐
       │        │  CascadeFlow │
       │        │  (Model)     │ ────► ┌──────────────┐
       │        └──────────────┘       │ Basic Chain  │
       │        Quality checks         │ Chain        │
       │        Cascades if needed     │ & more       │
       │                                └──────────────┘
┌──────┴──────┐
│  Verifier   │ (e.g., Claude Sonnet, GPT-4o)
└─────────────┘

Inputs

Port	Type	Required	Description
Verifier	`ai_languageModel`	Yes	Powerful model used when drafter quality is too low
Drafter	`ai_languageModel`	Yes	Cheap/fast model tried first
Domain models	`ai_languageModel`	No	Appear when domain cascading is enabled

Output

Port	Type	Description
Model	`ai_languageModel`	Language Model connection for downstream chain/LLM nodes

Parameters

Parameter	Default	Description
Quality Threshold	0.4	Minimum quality score (0-1) to accept drafter response
Use Complexity Thresholds	true	Per-complexity confidence thresholds (trivial→expert)
Enable Alignment Scoring	true	Score query-response alignment for better validation
Enable Complexity Routing	true	Route complex queries directly to verifier
Enable Domain Cascading	false	Detect query domain and route to specialized models

Quick Start

┌──────────────────┐
│ When chat        │
│ message received │
└────────┬─────────┘
         │
         v
┌──────────────────┐       ┌──────────────────┐
│  OpenAI Model    │──────►│                  │
│  gpt-4o-mini     │       │  CascadeFlow     │       ┌──────────────────┐
└──────────────────┘       │  (Model)         │──────►│ Basic LLM Chain  │
                           │                  │       │                  │
┌──────────────────┐       │  Threshold: 0.4  │       └──────────────────┘
│  OpenAI Model    │──────►│                  │
│  gpt-4o          │       └──────────────────┘
└──────────────────┘

Node 2: CascadeFlow Agent

A standalone agent node (main in/out) with its own agent loop, tool calling, memory, and per-tool cascade/verifier routing.

When to use

You need tool calling with cascade-aware routing
You want memory (conversation history) built in
You want to wire directly into a workflow (Chat Trigger → Agent → response)
You need per-tool routing rules (force verifier after specific tools)
You need tool call validation (drafter tool calls verified before execution)

Architecture

┌──────────────────┐
│ Chat Trigger     │
│ or any node      │
└────────┬─────────┘
         │ (main)
         v
┌──────────────────────────────────────────┐
│            CascadeFlow Agent             │
│                                          │
│  ┌─────────┐  ┌─────────┐  ┌──────────┐  │
│  │ Verifier│  │ Drafter │  │ Memory   │  │
│  └────┬────┘  └────┬────┘  └────┬─────┘  │
│       │            │            │        │
│  ┌────┴────────────┴────┐       │        │
│  │  Cascade Engine      │◄──────┘        │
│  │  + Agent Loop        │                │
│  └──────────┬───────────┘                │
│             │                            │
│  ┌──────────┴───────────┐                │
│  │  Tools               │                │
│  └──────────────────────┘                │
└──────────────────┬───────────────────────┘
                   │ (main)
                   v
┌──────────────────┐
│ Next node        │
│ (response, etc.) │
└──────────────────┘

Inputs

Port	Type	Required	Description
(main)	`main`	Yes	Workflow items from upstream node (e.g., Chat Trigger)
Verifier	`ai_languageModel`	Yes	Powerful model for verification and escalation
Drafter	`ai_languageModel`	Yes	Cheap/fast model tried first
Memory	`ai_memory`	No	Chat memory (e.g., Window Buffer Memory) for conversation history
Tools	`ai_tool`	No	Up to 99 tools for the agent to call
Domain models	`ai_languageModel`	No	Appear when domain cascading is enabled

Output

Port	Type	Description
Output	`main`	Workflow items with `output`, cascade metadata, and `trace`

The output JSON for each item contains:

{
  "output": "The agent's final response text",
  "model_used": "gpt-4o-mini",
  "domain": "code",
  "confidence": 0.85,
  "trace": [
    { "model_used": "gpt-4o-mini", "tool_calls": ["search"] },
    { "model_used": "gpt-4o", "tool_calls": [] }
  ]
}
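
For downstream nodes that consume this JSON, a typed view of the item can help. The sketch below is illustrative only: the names `AgentOutput`, `TraceStep`, and `summarizeTrace` are hypothetical, and the field list is inferred from the example above rather than taken from a published schema.

```typescript
// Shape inferred from the example output above (an assumption, not an official schema).
interface TraceStep {
  model_used: string;
  tool_calls: string[];
}

interface AgentOutput {
  output: string;
  model_used: string;
  domain: string;
  confidence: number;
  trace: TraceStep[];
}

// Hypothetical helper: count how many trace steps each model handled
// and collect the names of all tools that were called.
function summarizeTrace(item: AgentOutput): {
  callsPerModel: Record<string, number>;
  toolsCalled: string[];
} {
  const callsPerModel: Record<string, number> = {};
  const toolsCalled: string[] = [];
  for (const step of item.trace) {
    callsPerModel[step.model_used] = (callsPerModel[step.model_used] ?? 0) + 1;
    toolsCalled.push(...step.tool_calls);
  }
  return { callsPerModel, toolsCalled };
}

// The example item from this section.
const exampleItem: AgentOutput = {
  output: "The agent's final response text",
  model_used: "gpt-4o-mini",
  domain: "code",
  confidence: 0.85,
  trace: [
    { model_used: "gpt-4o-mini", tool_calls: ["search"] },
    { model_used: "gpt-4o", tool_calls: [] },
  ],
};

const traceSummary = summarizeTrace(exampleItem);
console.log(traceSummary);
```

A summary like this can be useful for checking how often a workflow escalates to the verifier.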

Parameters

Parameter	Default	Description
System Message	(empty)	System prompt for the agent
Text	`{{ $json.chatInput }}`	User input message. Auto-wires with Chat Trigger.
Quality Threshold	0.4	Minimum quality score to accept drafter response
Use Complexity Thresholds	true	Per-complexity confidence thresholds
Enable Tool Call Validation	true	Validate drafter tool calls before execution; re-generate with verifier on failure
Max Tool Iterations	3	Maximum tool-call loop iterations
Tool Routing Rules	(none)	Per-tool routing overrides (cascade or force verifier)
Enable Domain Cascading	false	Domain-specific model routing
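
To make the role of Max Tool Iterations concrete, here is a minimal sketch of a bounded tool-calling loop. It is not the node's implementation: the types `ModelTurn`, `AgentModel`, and `Tool`, the function `runAgent`, and the behavior when the limit is reached are all assumptions, and the model is kept synchronous for brevity.

```typescript
// Hypothetical types: a model either answers or asks for one tool call.
type ModelTurn =
  | { kind: "final"; text: string }
  | { kind: "tool"; tool: string; input: string };
type AgentModel = (history: string[]) => ModelTurn;
type Tool = (input: string) => string;

interface AgentRun {
  output: string;
  toolCalls: string[];
}

function runAgent(
  userText: string,
  model: AgentModel,
  tools: Map<string, Tool>,
  maxToolIterations = 3, // default from the Parameters table
): AgentRun {
  const history: string[] = [`user: ${userText}`];
  const toolCalls: string[] = [];
  for (let i = 0; i < maxToolIterations; i++) {
    const turn = model(history);
    if (turn.kind === "final") {
      return { output: turn.text, toolCalls };
    }
    // Run the requested tool and feed its result back to the model.
    const tool = tools.get(turn.tool);
    const result = tool ? tool(turn.input) : `error: unknown tool "${turn.tool}"`;
    toolCalls.push(turn.tool);
    history.push(`tool ${turn.tool}: ${result}`);
  }
  // Assumption: once the budget is spent, stop and report instead of looping further.
  return { output: "Stopped: tool iteration limit reached.", toolCalls };
}

// Stub model: calls the calculator once, then answers.
const mockModel: AgentModel = (history) =>
  history.some((h) => h.startsWith("tool calculator:"))
    ? { kind: "final", text: "The answer is 4." }
    : { kind: "tool", tool: "calculator", input: "2+2" };
// Stub model that never stops asking for tools.
const loopingModel: AgentModel = () => ({ kind: "tool", tool: "calculator", input: "2+2" });

const agentTools = new Map<string, Tool>([["calculator", () => "4"]]);
const finished = runAgent("What is 2+2?", mockModel, agentTools);
const capped = runAgent("What is 2+2?", loopingModel, agentTools);
console.log(finished.output, capped.toolCalls.length);
```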

Quick Start

┌──────────────────┐
│ Chat Trigger     │
└────────┬─────────┘
         │
         v
┌──────────────────────────────────────────┐
│            CascadeFlow Agent             │
│                                          │
│  Claude Haiku ──► Drafter                │
│  Claude Sonnet ─► Verifier               │       ┌──────────────────┐
│  Window Buffer ─► Memory                 │──────►│  Respond to      │
│  HTTP Request ──► Tool                   │       │  Webhook         │
│  Calculator ────► Tool                   │       └──────────────────┘
└──────────────────────────────────────────┘

Tool Routing Rules

Override cascade behavior for specific tools:

Routing	Behavior
Cascade (default)	Drafter generates tool calls, cascade validates
Verifier	After this tool executes, the verifier generates the final response

Use verifier routing for high-stakes tools (e.g., database writes, payment APIs) where you want the powerful model to interpret results.
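
The two routing modes above can be modeled as a simple lookup. This is a sketch under assumptions: `ToolRoutingRule`, `routingForTool`, `shouldForceVerifier`, and the tool names are hypothetical and do not reflect how the node stores its rules.

```typescript
type Routing = "cascade" | "verifier";

// Hypothetical rule shape: one entry per tool that overrides the default.
interface ToolRoutingRule {
  toolName: string;
  routing: Routing;
}

// Tools without an explicit rule use the default "cascade" routing.
function routingForTool(toolName: string, rules: ToolRoutingRule[]): Routing {
  const rule = rules.find((r) => r.toolName === toolName);
  return rule ? rule.routing : "cascade";
}

// If any executed tool is routed to the verifier, the verifier writes the final response.
function shouldForceVerifier(executedTools: string[], rules: ToolRoutingRule[]): boolean {
  return executedTools.some((name) => routingForTool(name, rules) === "verifier");
}

// Example: a made-up database-write tool is treated as high-stakes.
const routingRules: ToolRoutingRule[] = [{ toolName: "update_order", routing: "verifier" }];

console.log(shouldForceVerifier(["search"], routingRules));
console.log(shouldForceVerifier(["search", "update_order"], routingRules));
```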

Tool Call Validation

When enabled (default), the agent validates drafter-generated tool calls before executing them:

JSON syntax check
Schema validation
Safety checks

If validation fails, tool calls are re-generated by the verifier model, preventing malformed or unsafe tool invocations.
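
The three checks can be pictured as a small pipeline. The sketch below is a simplified stand-in, not the node's validator: `ToolCall`, `ToolSpec`, and `validateToolCall` are hypothetical, schema validation is reduced to "known tool with required keys present", and the safety check is a placeholder pattern match because the source does not say what the real checks cover.

```typescript
interface ToolCall {
  name: string;
  arguments: string; // raw JSON text as produced by the drafter
}

interface ToolSpec {
  name: string;
  required: string[]; // argument keys that must be present
}

type Validation = { ok: true } | { ok: false; reason: string };

function validateToolCall(
  call: ToolCall,
  specs: ToolSpec[],
  blockedPatterns: RegExp[],
): Validation {
  // 1. JSON syntax check
  let parsed: unknown;
  try {
    parsed = JSON.parse(call.arguments);
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: "arguments must be a JSON object" };
  }
  // 2. Schema validation (simplified)
  const spec = specs.find((s) => s.name === call.name);
  if (!spec) {
    return { ok: false, reason: `unknown tool: ${call.name}` };
  }
  const args = parsed as Record<string, unknown>;
  const missing = spec.required.filter((key) => !(key in args));
  if (missing.length > 0) {
    return { ok: false, reason: `missing arguments: ${missing.join(", ")}` };
  }
  // 3. Safety check (placeholder: reject arguments matching a blocked pattern)
  if (blockedPatterns.some((pattern) => pattern.test(call.arguments))) {
    return { ok: false, reason: "blocked by safety check" };
  }
  return { ok: true };
}

const toolSpecs: ToolSpec[] = [{ name: "run_query", required: ["sql"] }];
const blocked = [/drop\s+table/i];

const valid = validateToolCall({ name: "run_query", arguments: '{"sql":"SELECT 1"}' }, toolSpecs, blocked);
const malformed = validateToolCall({ name: "run_query", arguments: '{"sql": ' }, toolSpecs, blocked);
const unsafe = validateToolCall({ name: "run_query", arguments: '{"sql":"DROP TABLE users"}' }, toolSpecs, blocked);
console.log(valid, malformed, unsafe);
```

In this model, any result with `ok: false` is the point at which the tool call would be handed to the verifier for re-generation.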

Shared Features

Both nodes share these capabilities:

Cascade Flow

The query goes to the cheap drafter model first
cascadeflow validates the response quality
If quality passes → return the drafter response (fast + cheap)
If quality fails → escalate to the verifier model (slower but more accurate)

Result: the drafter's response is accepted for 70-80% of queries, saving 40-85% on costs.
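
The flow above can be sketched in a few lines. This is a minimal illustration, not the @cascadeflow/core API: `Model`, `Scorer`, and `cascade` are hypothetical names, the models are synchronous stubs for brevity, and accepting on "score greater than or equal to threshold" is an assumption.

```typescript
// Hypothetical types; real chat models are asynchronous.
type Model = (prompt: string) => string;
type Scorer = (prompt: string, response: string) => number; // quality score in 0-1

interface CascadeResult {
  output: string;
  modelUsed: "drafter" | "verifier";
  draftConfidence: number;
}

function cascade(
  prompt: string,
  drafter: Model,
  verifier: Model,
  score: Scorer,
  qualityThreshold = 0.4, // default Quality Threshold
): CascadeResult {
  // 1. The cheap drafter answers first.
  const draft = drafter(prompt);
  // 2. Score the draft.
  const draftConfidence = score(prompt, draft);
  // 3. Accept the draft if it meets the threshold.
  if (draftConfidence >= qualityThreshold) {
    return { output: draft, modelUsed: "drafter", draftConfidence };
  }
  // 4. Otherwise escalate; the verifier is only called on this path.
  return { output: verifier(prompt), modelUsed: "verifier", draftConfidence };
}

// Stubs: the drafter gives up (empty answer) on prompts of 20+ characters.
const stubDrafter: Model = (p) => (p.length < 20 ? `draft: ${p}` : "");
const stubVerifier: Model = (p) => `verified: ${p}`;
// Toy scorer: an empty answer scores 0, anything else 0.85.
const stubScore: Scorer = (_p, r) => (r.length > 0 ? 0.85 : 0);

const fast = cascade("What is 2 + 2?", stubDrafter, stubVerifier, stubScore);
const slow = cascade("Explain the CAP theorem in depth.", stubDrafter, stubVerifier, stubScore);
console.log(fast.modelUsed, slow.modelUsed);
```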

Multi-Domain Cascading (Optional)

Both nodes support domain-specific cascading. Enable it in the node settings to automatically detect query domains and route to specialized models.

Supported domains:

Domain	Description	Example Queries
Code	Programming, debugging, code generation	"Write a Python function...", "Debug this code..."
Math	Mathematical reasoning, calculations, proofs	"Solve this equation...", "Prove that..."
Data	Data analysis, statistics, pandas/SQL	"Analyze this dataset...", "Write a SQL query..."
Creative	Creative writing, stories, poetry	"Write a short story...", "Compose a poem..."
Legal	Legal documents, contracts, regulations	"Draft a contract...", "Explain this law..."
Medical	Healthcare, medical knowledge, clinical	"What are the symptoms of...", "Explain this diagnosis..."
Financial	Finance, accounting, investment analysis	"Analyze this stock...", "Calculate ROI..."
Science	Scientific knowledge, research, experiments	"Explain quantum...", "How does photosynthesis..."
Structured	JSON, XML, structured output	"Generate a JSON schema..."
RAG	Retrieval-augmented generation	"Based on the document..."
Conversation	General chat, small talk	"How are you?", "Tell me about..."
Tool	Tool-oriented queries	"Search for...", "Calculate..."
Summary	Summarization tasks	"Summarize this article..."
Translation	Language translation	"Translate to French..."
Multimodal	Image/audio/video queries	"Describe this image..."
General	Catch-all domain	Everything else

Setup:

Enable Domain Cascading in node settings
Toggle individual domains
Connect domain-specific models to the new input ports
Optionally enable domain verifiers to override the global verifier per domain
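
Conceptually, the setup above amounts to choosing a model pair per detected domain. The sketch below is an assumption-heavy illustration: `Domain`, `DomainSlot`, and `resolveModels` are hypothetical, only the lowercase id `code` appears in the source (the other ids are guessed from the table), and both the use of the domain model in place of the drafter and the fallback to the global models are assumed behaviors.

```typescript
// Domain ids guessed from the table above; only "code" is confirmed by the example output.
type Domain =
  | "code" | "math" | "data" | "creative" | "legal" | "medical"
  | "financial" | "science" | "structured" | "rag" | "conversation"
  | "tool" | "summary" | "translation" | "multimodal" | "general";

interface DomainSlot {
  enabled: boolean;
  model?: string;    // domain-specific model, if one is connected
  verifier?: string; // optional per-domain verifier override
}

interface ModelPair {
  drafter: string;
  verifier: string;
}

function resolveModels(
  domain: Domain,
  slots: Partial<Record<Domain, DomainSlot>>,
  globals: ModelPair,
): ModelPair {
  const slot = slots[domain];
  // Disabled or unconfigured domains use the global drafter and verifier.
  if (!slot || !slot.enabled) {
    return globals;
  }
  return {
    drafter: slot.model ?? globals.drafter,
    verifier: slot.verifier ?? globals.verifier,
  };
}

// Model names here are placeholders, not recommendations.
const globalModels: ModelPair = { drafter: "gpt-4o-mini", verifier: "gpt-4o" };
const domainSlots: Partial<Record<Domain, DomainSlot>> = {
  code: { enabled: true, model: "code-model" },
  legal: { enabled: true, model: "legal-model", verifier: "legal-verifier" },
  math: { enabled: false, model: "math-model" },
};

console.log(resolveModels("code", domainSlots, globalModels));
console.log(resolveModels("math", domainSlots, globalModels));
```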

Complexity Thresholds

When enabled (default), the confidence a drafter response needs for acceptance depends on the detected query complexity:

Complexity	Default Threshold
Trivial	0.25
Simple	0.40
Moderate	0.55
Hard	0.70
Expert	0.80
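
As a worked illustration of the table, the acceptance decision can be written as a lookup. The names `Complexity`, `DEFAULT_THRESHOLDS`, and `acceptDraft` are hypothetical, and falling back to the single Quality Threshold when complexity thresholds are disabled is an assumption about how the two settings interact.

```typescript
type Complexity = "trivial" | "simple" | "moderate" | "hard" | "expert";

// Default thresholds from the table above.
const DEFAULT_THRESHOLDS: Record<Complexity, number> = {
  trivial: 0.25,
  simple: 0.4,
  moderate: 0.55,
  hard: 0.7,
  expert: 0.8,
};

function acceptDraft(
  confidence: number,
  complexity: Complexity,
  useComplexityThresholds = true,
  qualityThreshold = 0.4,
): boolean {
  // Assumption: with complexity thresholds off, the flat Quality Threshold applies.
  const threshold = useComplexityThresholds
    ? DEFAULT_THRESHOLDS[complexity]
    : qualityThreshold;
  return confidence >= threshold;
}

console.log(acceptDraft(0.85, "hard"));     // 0.85 vs 0.70
console.log(acceptDraft(0.6, "hard"));      // 0.60 vs 0.70
console.log(acceptDraft(0.6, "moderate"));  // 0.60 vs 0.55
```

Harder queries demand more confidence before the drafter's answer is kept, so the same score of 0.6 passes for a moderate query but not for a hard one.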

Flow Visualization

Viewing Cascade Decisions in Real-Time

cascadeflow provides detailed logging of every cascade decision in n8n's UI:

Execute your workflow
For CascadeFlow (Model): Click the downstream Chain node → "Logs" tab
For CascadeFlow Agent: Click the Agent node → "Output" tab (trace is in the output JSON)

Example log output:

CascadeFlow: Trying drafter model...
   Quality validation: confidence=0.85, method=heuristic
   Alignment: 0.82

   FLOW: DRAFTER ACCEPTED (FAST PATH)
   Query -> Drafter -> Quality Check -> Response
   Confidence: 0.85 (threshold: 0.70)
   Cost savings: ~93.8% (used cheap model)

Recommended Model Configurations

Claude Haiku + GPT-4o (Recommended)

Drafter: claude-3-5-haiku-20241022
Verifier: gpt-4o
Savings: ~73% average
Best for: General purpose, coding, reasoning

Anthropic Only (High Quality)

Drafter: claude-3-5-haiku-20241022
Verifier: claude-3-5-sonnet-20241022
Savings: ~70% average

OpenAI Only (Good Balance)

Drafter: gpt-4o-mini
Verifier: gpt-4o
Savings: ~85% average

Ultra Fast with Ollama (Local)

Drafter: ollama/qwen2.5:3b (local)
Verifier: gpt-4o (cloud)
Savings: ~99% on drafter calls (no API cost)
Note: Requires Ollama installed locally
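
How much a given pairing saves depends mainly on how often the drafter is accepted and on the price gap between the two models. The sketch below is a rough back-of-the-envelope model, not the method behind the figures above: `estimateSavings` is a hypothetical helper, the prices are made-up units, and it ignores validation overhead and differences in token counts.

```typescript
// Estimated fraction saved versus sending every query to the verifier.
function estimateSavings(
  acceptRate: number,   // share of queries where the draft is accepted (0-1)
  drafterCost: number,  // cost per query on the drafter
  verifierCost: number, // cost per query on the verifier
): number {
  if (acceptRate < 0 || acceptRate > 1) {
    throw new RangeError("acceptRate must be between 0 and 1");
  }
  if (verifierCost <= 0) {
    throw new RangeError("verifierCost must be positive");
  }
  // Every query pays for the drafter; rejected drafts also pay for the verifier.
  const cascadeCost = drafterCost + (1 - acceptRate) * verifierCost;
  return 1 - cascadeCost / verifierCost;
}

// Made-up example: drafter costs 0.06 units, verifier 1.00, 75% of drafts accepted.
const savings = estimateSavings(0.75, 0.06, 1.0);
console.log(`${(savings * 100).toFixed(0)}%`); // prints "69%"
```

With a free local drafter the estimate reduces to the acceptance rate itself, and with a 0% acceptance rate the cascade costs slightly more than calling the verifier directly.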

Troubleshooting

"Drafter model is required"

Make sure you've connected an AI Chat Model to the Drafter input port.

"Verifier model is required"

Make sure you've connected an AI Chat Model to the Verifier input port.

Not seeing cascade logs

CascadeFlow (Model): Logs appear in the downstream Chain node's "Logs" tab, not the cascadeflow node itself.
CascadeFlow Agent: Cascade metadata and trace are in the output JSON of the Agent node.

Always escalating to verifier

Try lowering the Quality Threshold (0.3-0.4 is a reasonable range; the default is 0.4)
Verify your drafter model is actually a cheaper/faster model
Check logs for the confidence scores being reported

"This node cannot be connected"

Use CascadeFlow (Model) with Chain/LLM nodes that accept Language Model inputs
Use CascadeFlow Agent for standalone agent workflows with tool calling and memory

Compatibility

n8n version: 1.0+
Works with any AI Chat Model node in n8n:
- OpenAI Chat Model
- Anthropic Chat Model
- Ollama Chat Model
- Azure OpenAI Chat Model
- Google PaLM Chat Model
- And more...

License

MIT

Version History

v1.0.0

CascadeFlow Agent → standalone node: Converted from supplyData() sub-node to execute() node with main in/out
Memory support: Added ai_memory input for conversation history (Window Buffer Memory, etc.)
System message & text params: Agent node now has its own system prompt and text input (defaults to {{ $json.chatInput }})
Direct workflow wiring: Chat Trigger → CascadeFlow Agent → response, no intermediate Chain node needed

v0.7.x

Domain cascading labels: Shortened domain input labels, section dividers, tool call validation on by default
Single getInputConnectionData call: Correct model resolution and n8n highlighting

v0.6.x

Multi-domain cascading: 16-domain intelligent routing with individual toggles and dynamic input ports
Removed semantic validation: Disabled ML-based semantic validation to prevent OOM crashes
Circuit breaker: Added circuit breaker pattern for improved reliability

v0.5.0

Flow visualization: Detailed cascade flow logging in n8n Logs tab
Quality validator integration: Integrated QualityValidator from @cascadeflow/core
Complexity-aware validation: Replaces naive length-based checks

v0.4.x and earlier

Initial releases as LangChain sub-node
Support for any AI Chat Model in n8n
Lazy verifier loading
Quality threshold configuration
