anthropic-ratelimited

Anthropic Chat Model node for n8n with built-in rate limit handling (longer waits on 429 errors)

Package Information

Downloads: 1 weekly / 50 monthly
Latest Version: 0.1.3
Author: zindel.digital

Documentation

n8n-nodes-anthropic-ratelimited

An n8n community node that wraps the Anthropic Claude chat model with built-in rate limit handling — designed for low-tier API plans (e.g. 30k TPM).

The Problem

The standard n8n Anthropic Chat Model node fails when it hits API rate limits (429 errors). The n8n retry mechanism is capped at 5 retries of 5 seconds each (about 25 seconds in total), which is far too short for per-minute token rate limits, where waits of 60 seconds or more are needed.
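
To make the mismatch concrete, the arithmetic can be sketched as follows. This is only an illustration: the function name is hypothetical, fixed waits without backoff are assumed, and the 60-second window is the per-minute limit described above. The second call uses this node's defaults of 10 retries and a 65-second wait.

```typescript
// Total time a fixed-wait retry policy can spend waiting, in seconds.
// Hypothetical helper for illustration; not part of n8n or this package.
function totalRetryWaitSeconds(maxRetries: number, waitSeconds: number): number {
  return maxRetries * waitSeconds;
}

const windowSeconds = 60; // a per-minute rate limit window

const n8nBudget = totalRetryWaitSeconds(5, 5);    // 25
const nodeBudget = totalRetryWaitSeconds(10, 65); // 650

// Can the policy keep waiting long enough for the window to roll over?
console.log(n8nBudget >= windowSeconds);  // false
console.log(nodeBudget >= windowSeconds); // true
```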

The Solution

This node injects rate-limit-aware retry logic at the LangChain level (via onFailedAttempt), before errors bubble up to n8n's retry mechanism. Key features:

  • Configurable wait on 429: Default 65 seconds, adjustable up to 300s
  • Configurable max retries: Default 10, adjustable up to 50
  • Proactive delay: Optional delay before every LLM call to stay under limits
  • topP/temperature conflict fix: Automatically handles the Anthropic API restriction on sending both temperature and top_p in the same request
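
The limits in the list above could be enforced with a small normalization step like the one sketched here. The interface, field names, and function names are hypothetical and are not taken from the package source. The defaults (65, 10, 0) and upper bounds (300, 50) come from this README; the lower bound of 0 is an assumption.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface RateLimitSettings {
  waitOnRateLimitSeconds: number;
  maxRateLimitRetries: number;
  delayBetweenCallsSeconds: number;
}

// Defaults as documented in this README.
const DEFAULT_SETTINGS: RateLimitSettings = {
  waitOnRateLimitSeconds: 65,
  maxRateLimitRetries: 10,
  delayBetweenCallsSeconds: 0,
};

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Fill in missing values with defaults and clamp to the documented maxima.
// The lower bound of 0 is an assumption, not a documented limit.
function normalizeSettings(input: Partial<RateLimitSettings>): RateLimitSettings {
  const wait = input.waitOnRateLimitSeconds ?? DEFAULT_SETTINGS.waitOnRateLimitSeconds;
  const retries = input.maxRateLimitRetries ?? DEFAULT_SETTINGS.maxRateLimitRetries;
  const delay = input.delayBetweenCallsSeconds ?? DEFAULT_SETTINGS.delayBetweenCallsSeconds;
  return {
    waitOnRateLimitSeconds: clamp(wait, 0, 300),
    maxRateLimitRetries: clamp(Math.floor(retries), 0, 50),
    delayBetweenCallsSeconds: Math.max(0, delay),
  };
}
```

For example, `normalizeSettings({ waitOnRateLimitSeconds: 900 })` would yield a 300-second wait with the default 10 retries and no proactive delay.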

Installation

Option A: Self-hosted (Cloud Run / Docker)

  1. Build the package and copy it into your n8n custom extensions directory:
# Build
cd n8n-nodes-anthropic-ratelimited
npm install
npm run build

# Copy to n8n's custom extensions
cp -r . /path-to-n8n/custom-extensions/n8n-nodes-anthropic-ratelimited
  2. Set the environment variable in your Docker/Cloud Run config:
N8N_CUSTOM_EXTENSIONS=/path-to-custom-extensions
  3. Restart n8n.

Option B: npm (if published)

cd ~/.n8n
npm install n8n-nodes-anthropic-ratelimited

Configuration

Rate Limit Settings

Setting                       | Default | Description
------------------------------|---------|--------------------------------------
Wait on Rate Limit (seconds)  | 65      | How long to wait after a 429 error
Max Rate Limit Retries        | 10      | How many 429 retries before giving up
Delay Between Calls (seconds) | 0       | Proactive delay before every LLM call

Recommended Settings for 30k TPM

  • Wait on Rate Limit: 65 seconds
  • Max Rate Limit Retries: 10
  • Delay Between Calls: 10-15 seconds (if you want to avoid 429s entirely)
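
As a rough guide to choosing the delay, the per-call token budget it implies can be estimated as sketched below. This assumes the limit behaves like a simple 60-second window and that calls are spaced at least one delay apart. The real API may account for tokens differently, and both function names are hypothetical.

```typescript
// Upper bound on how many calls can start within any 60-second window
// when consecutive calls are at least `delaySeconds` apart.
function maxCallsPerMinute(delaySeconds: number): number {
  if (delaySeconds <= 0) {
    throw new RangeError("delaySeconds must be greater than 0");
  }
  return Math.ceil(60 / delaySeconds);
}

// Average tokens each call may use without exceeding the per-minute limit.
function tokenBudgetPerCall(tokensPerMinute: number, delaySeconds: number): number {
  return Math.floor(tokensPerMinute / maxCallsPerMinute(delaySeconds));
}

console.log(tokenBudgetPerCall(30_000, 10)); // 5000
console.log(tokenBudgetPerCall(30_000, 15)); // 7500
```

Under these assumptions, a 10-second delay stays under 30k TPM only if calls average 5,000 tokens or fewer. Larger prompts call for a longer delay or reliance on the 429 wait.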

Model Options

Same as the standard Anthropic node: Temperature, Top K, Top P, Max Tokens.

Usage

  1. Add "Anthropic Chat Model (Rate Limited)" as the model sub-node for your AI Agent
  2. Configure your Anthropic API credentials
  3. Set the rate limit parameters based on your API tier
  4. Connect to your AI Agent node as usual

The node is a drop-in replacement — your workflow structure stays identical.

How It Works

AI Agent calls LLM
    → proactive delay (if configured)
        → ChatAnthropic._generate()
            → Anthropic API
                → 429? → wait N seconds → retry (up to max)
                → 200? → return response to agent

The key insight: by handling retries inside LangChain's onFailedAttempt callback, the AI Agent node never sees the 429 error. From the agent's perspective, the LLM call just took longer.
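
The flow above can be approximated by a standalone retry wrapper such as the sketch below. It is not the package's implementation and does not use LangChain. The function and option names are hypothetical, and it assumes rate-limit errors expose a numeric `status` of 429. The `sleep` function is injectable so the logic can be exercised without real waiting.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical option names; see the Configuration section for the real settings.
interface RetryOptions {
  waitSeconds: number;              // wait after each 429
  maxRetries: number;               // retries allowed after the first attempt
  delayBetweenCallsSeconds: number; // proactive delay before the call
  sleep?: Sleep;                    // injectable for testing
}

// Assumption: the error carries the HTTP status code as `status`.
function isRateLimitError(error: unknown): boolean {
  return typeof error === "object" && error !== null &&
    (error as { status?: number }).status === 429;
}

async function callWithRateLimitRetry<T>(
  call: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const sleep = options.sleep ?? realSleep;

  // Proactive delay, applied once before the first attempt.
  if (options.delayBetweenCallsSeconds > 0) {
    await sleep(options.delayBetweenCallsSeconds * 1000);
  }

  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      // Anything other than a 429, or running out of retries, goes to the caller.
      if (!isRateLimitError(error) || attempt >= options.maxRetries) {
        throw error;
      }
      await sleep(options.waitSeconds * 1000);
    }
  }
}
```

A caller awaiting `callWithRateLimitRetry(...)` sees either the eventual result or the final error, which mirrors how the agent only experiences a slower LLM call.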

License

MIT
