n8n-nodes-anthropic-ratelimited
An n8n community node that wraps the Anthropic Claude chat model with built-in rate limit handling. It is designed for low-tier API plans (e.g. 30k tokens per minute, TPM).
The Problem
The standard n8n Anthropic Chat Model node fails when it hits API rate limits (HTTP 429 errors). n8n's built-in retry mechanism is capped at 5 tries with at most 5 seconds between them, which is far too short for per-minute token rate limits, where waits of 60 seconds or more are needed.
The Solution
This node injects rate-limit-aware retry logic at the LangChain level (via onFailedAttempt), before errors bubble up to n8n's retry mechanism. Key features:
- Configurable wait on 429: Default 65 seconds, adjustable up to 300 seconds
- Configurable max retries: Default 10, adjustable up to 50
- Proactive delay: Optional delay before every LLM call to stay under limits
- topP/temperature conflict fix: Automatically handles the Anthropic API restriction on combining temperature and top_p in a single request
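The retry features above can be sketched as a small decision function of the kind an onFailedAttempt-style hook might call. This is a minimal sketch under stated assumptions, not the node's actual source: the names `RateLimitSettings`, `normalizeSettings`, `FailedAttempt`, and `decideRetry` are hypothetical, the lower bound of 0 is assumed, and the sketch assumes the failed attempt exposes the HTTP status as a numeric `status` field. Only the defaults and upper bounds are taken from the list above.

```typescript
// Hypothetical settings shape; defaults and upper bounds mirror the feature list.
interface RateLimitSettings {
  waitSeconds: number; // wait after a 429 (default 65, max 300)
  maxRetries: number;  // 429 retries before giving up (default 10, max 50)
}

// Clamp a value into [min, max]; fall back to a default when it is missing or not finite.
function clamp(value: number | undefined, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || !isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Assumption: 0 is the lowest allowed value for both settings.
function normalizeSettings(input: Partial<RateLimitSettings>): RateLimitSettings {
  return {
    waitSeconds: clamp(input.waitSeconds, 0, 300, 65),
    maxRetries: clamp(input.maxRetries, 0, 50, 10),
  };
}

// Assumption: the failed attempt surfaces the HTTP status as `status`.
interface FailedAttempt {
  status?: number;
  attemptNumber: number; // 1 for the first failed attempt
}

type RetryDecision = { retry: true; waitMs: number } | { retry: false };

function decideRetry(attempt: FailedAttempt, settings: RateLimitSettings): RetryDecision {
  // Only rate-limit errors are handled here; everything else propagates unchanged.
  if (attempt.status !== 429) return { retry: false };
  // Give up once the configured number of 429 retries has been used.
  if (attempt.attemptNumber > settings.maxRetries) return { retry: false };
  return { retry: true, waitMs: settings.waitSeconds * 1000 };
}
```

Keeping the decision separate from the actual sleeping makes the policy easy to check without real timers: with the defaults, the first ten 429 failures each yield a 65,000 ms wait and the eleventh gives up.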
Installation
Option A: Self-hosted (Cloud Run / Docker)
- Build the package and copy it into your n8n custom extensions directory:
# Build
cd n8n-nodes-anthropic-ratelimited
npm install
npm run build
# Copy to n8n's custom extensions
cp -r . /path-to-n8n/custom-extensions/n8n-nodes-anthropic-ratelimited
- Set the environment variable in your Docker/Cloud Run config:
N8N_CUSTOM_EXTENSIONS=/path-to-custom-extensions
- Restart n8n.
Option B: npm (if published)
cd ~/.n8n
npm install n8n-nodes-anthropic-ratelimited
Configuration
Rate Limit Settings
| Setting | Default | Description |
|---|---|---|
| Wait on Rate Limit (seconds) | 65 | How long to wait after a 429 error |
| Max Rate Limit Retries | 10 | How many 429 retries before giving up |
| Delay Between Calls (seconds) | 0 | Proactive delay before every LLM call |
Recommended Settings for 30k TPM
- Wait on Rate Limit: 65 seconds
- Max Rate Limit Retries: 10
- Delay Between Calls: 10-15 seconds (if you want to avoid 429s entirely)
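The 10-15 second range is consistent with a simple token budget calculation. The sketch below assumes each call consumes roughly 5,000 to 7,500 tokens; those figures are hypothetical, as are the function names, so substitute your workflow's real usage. It also shows the longest time a single call could spend waiting if every retry hit another 429.

```typescript
// Minimum spacing between calls so that average usage stays at or below the TPM limit.
// tokensPerCall is an assumed average (input + output) for your workflow.
function minDelaySeconds(tokensPerCall: number, tokensPerMinute: number): number {
  return (60 * tokensPerCall) / tokensPerMinute;
}

// Upper bound on time spent waiting if every retry runs into another 429.
function worstCaseWaitSeconds(waitSeconds: number, maxRetries: number): number {
  return waitSeconds * maxRetries;
}

console.log(minDelaySeconds(5000, 30000)); // 10
console.log(minDelaySeconds(7500, 30000)); // 15
console.log(worstCaseWaitSeconds(65, 10)); // 650
```

The calculation ignores the time the call itself takes, so it errs on the conservative side. The 650-second worst case (nearly 11 minutes) is worth keeping in mind when setting workflow timeouts.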
Model Options
Same as the standard Anthropic node: Temperature, Top K, Top P, Max Tokens.
Usage
- Add "Anthropic Chat Model (Rate Limited)" as the model sub-node for your AI Agent
- Configure your Anthropic API credentials
- Set the rate limit parameters based on your API tier
- Connect to your AI Agent node as usual
The node is a drop-in replacement — your workflow structure stays identical.
How It Works
AI Agent calls LLM
  → proactive delay (if configured)
  → ChatAnthropic._generate()
    → Anthropic API
      → 429? → wait N seconds → retry (up to max)
      → 200? → return response to agent
The key insight: by handling retries inside LangChain's onFailedAttempt callback, the AI Agent node never sees the 429 error unless the configured retries are exhausted. From the agent's perspective, the LLM call just took longer.
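To make the flow concrete, here is a minimal simulation that walks a scripted sequence of API statuses through the same steps on a virtual clock. It makes no real requests and uses no timers, and it is not the node's implementation: `FlowSettings`, `FlowResult`, and `simulateLlmCall` are hypothetical names, and the sketch assumes the proactive delay is applied once per LLM call rather than before each retry.

```typescript
// Hypothetical settings; names are illustrative, not the node's parameter names.
interface FlowSettings {
  delayBetweenCallsSeconds: number;
  waitOnRateLimitSeconds: number;
  maxRateLimitRetries: number;
}

interface FlowResult {
  outcome: "response" | "rate-limit-error";
  apiCalls: number;       // requests sent to the API
  elapsedSeconds: number; // time spent in the delay and in 429 waits only
}

// statuses is the scripted sequence of HTTP statuses the fake API returns.
function simulateLlmCall(statuses: Array<200 | 429>, settings: FlowSettings): FlowResult {
  let elapsed = settings.delayBetweenCallsSeconds; // proactive delay (if configured)
  let retries = 0;
  for (let i = 0; i < statuses.length; i++) {
    if (statuses[i] === 200) {
      // Success: the agent just sees a response that took longer.
      return { outcome: "response", apiCalls: i + 1, elapsedSeconds: elapsed };
    }
    // 429: give up if the retry budget is spent, otherwise wait and retry.
    if (retries >= settings.maxRateLimitRetries) {
      return { outcome: "rate-limit-error", apiCalls: i + 1, elapsedSeconds: elapsed };
    }
    retries++;
    elapsed += settings.waitOnRateLimitSeconds;
  }
  // The script ended before a 200 arrived.
  return { outcome: "rate-limit-error", apiCalls: statuses.length, elapsedSeconds: elapsed };
}

const result = simulateLlmCall([429, 429, 200], {
  delayBetweenCallsSeconds: 10,
  waitOnRateLimitSeconds: 65,
  maxRateLimitRetries: 10,
});
console.log(result.outcome, result.apiCalls, result.elapsedSeconds); // response 3 140
```

In this run the agent receives a normal response after three API calls and 140 seconds of waiting (10 + 65 + 65), without ever seeing a 429.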
License
MIT