Package Information
Downloads: 0 weekly / 15 monthly
Latest Version: 1.4.0
Author: NAF
Documentation
Crawl4AI n8n Nodes
Advanced web crawling, data extraction, and interaction nodes for n8n with LLM capabilities.
Installation
npm install n8n-nodes-crawl4ai_naf
Features
Main Crawl4ai Node
- Basic Crawling: Simple web page crawling with markdown/HTML extraction
- CSS Extraction: Extract structured data using CSS selectors
- LLM Extraction: Use an LLM for complex data extraction
- Batch Processing: Process multiple URLs concurrently
- Anti-Detection: Undetected browser mode, stealth mode, CAPTCHA bypass
Crawl4ai Interaction Node
- Element Interaction: Click buttons, fill forms, handle dropdowns
- Authentication: Login form handling and session management
- LLM Prompts: Automate interactions using natural language prompts
- Multi-Step Workflows: Complex interaction sequences
Usage Examples
Basic Crawling
{
  "nodes": [
    {
      "parameters": {
        "operation": "basic_crawl",
        "urlConfig": {
          "urls": [
            {
              "url": "https://example.com"
            }
          ]
        },
        "browserConfig": {
          "settings": {
            "headless": true,
            "viewportWidth": 1920,
            "viewportHeight": 1080
          }
        }
      },
      "name": "Crawl4ai",
      "type": "n8n-nodes-crawl4ai_naf.crawl4ai",
      "typeVersion": 1,
      "position": [250, 300]
    }
  ]
}
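When the URL list comes from another system, the same node entry can be generated rather than written by hand. The following is a minimal TypeScript sketch under that assumption: `buildBasicCrawlNode` is a hypothetical helper, not part of this package, and its field names simply mirror the example above.

```typescript
// Shape of the node entry used in the Basic Crawling example.
interface CrawlNode {
  parameters: {
    operation: string;
    urlConfig: { urls: { url: string }[] };
    browserConfig: {
      settings: { headless: boolean; viewportWidth: number; viewportHeight: number };
    };
  };
  name: string;
  type: string;
  typeVersion: number;
  position: [number, number];
}

// Hypothetical helper: wraps each URL in the { "url": ... } object the example uses.
function buildBasicCrawlNode(urls: string[]): CrawlNode {
  return {
    parameters: {
      operation: "basic_crawl",
      urlConfig: { urls: urls.map((url) => ({ url })) },
      browserConfig: {
        settings: { headless: true, viewportWidth: 1920, viewportHeight: 1080 },
      },
    },
    name: "Crawl4ai",
    type: "n8n-nodes-crawl4ai_naf.crawl4ai",
    typeVersion: 1,
    position: [250, 300],
  };
}

const basicWorkflow = { nodes: [buildBasicCrawlNode(["https://example.com"])] };
console.log(JSON.stringify(basicWorkflow, null, 2));
```

The printed JSON has the same structure as the example above and can be pasted into an n8n workflow import.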
Advanced Crawling with Authentication
{
  "nodes": [
    {
      "parameters": {
        "operation": "css_extraction",
        "urlConfig": {
          "urls": [
            {
              "url": "https://protected.example.com/dashboard"
            },
            {
              "url": "https://protected.example.com/reports"
            }
          ]
        },
        "browserConfig": {
          "settings": {
            "headless": true,
            "viewportWidth": 1920,
            "viewportHeight": 1080
          }
        },
        "antiDetection": {
          "settings": {
            "undetected": true,
            "stealth": true,
            "captchaBypass": "2captcha"
          }
        },
        "authConfig": {
          "authSettings": {
            "enableAuth": true,
            "authType": "form",
            "username": "your_username",
            "password": "your_password",
            "loginUrl": "https://protected.example.com/login"
          }
        },
        "advancedConfig": {
          "advancedSettings": {
            "maxRetries": 3,
            "timeout": 30000,
            "concurrentRequests": 2,
            "debugMode": true
          }
        }
      },
      "name": "Crawl4ai",
      "type": "n8n-nodes-crawl4ai_naf.crawl4ai",
      "typeVersion": 1,
      "position": [250, 300]
    }
  ]
}
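The `your_username` and `your_password` values above are placeholders. One way to avoid hard-coding real credentials when generating this configuration is to read them from a key/value map, such as a copy of the process environment. This is a sketch only: `buildFormAuth` and the `CRAWL_USERNAME` / `CRAWL_PASSWORD` key names are assumptions made for illustration, not something the package defines.

```typescript
// Mirrors the "authSettings" object in the example above.
interface AuthSettings {
  enableAuth: boolean;
  authType: string;
  username: string;
  password: string;
  loginUrl: string;
}

// Hypothetical helper: pulls credentials from a key/value map and fails fast
// if either one is missing, so an incomplete config is never produced.
function buildFormAuth(
  env: Record<string, string | undefined>,
  loginUrl: string,
): AuthSettings {
  const username = env["CRAWL_USERNAME"]; // assumed key name
  const password = env["CRAWL_PASSWORD"]; // assumed key name
  if (!username || !password) {
    throw new Error("CRAWL_USERNAME and CRAWL_PASSWORD must both be set");
  }
  return { enableAuth: true, authType: "form", username, password, loginUrl };
}

const authConfig = {
  authSettings: buildFormAuth(
    { CRAWL_USERNAME: "your_username", CRAWL_PASSWORD: "your_password" },
    "https://protected.example.com/login",
  ),
};
console.log(authConfig.authSettings.authType);
```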
LLM Extraction Example
{
  "nodes": [
    {
      "parameters": {
        "operation": "llm_extraction",
        "urlConfig": {
          "urls": [
            {
              "url": "https://complex-data.example.com"
            }
          ]
        },
        "browserConfig": {
          "settings": {
            "headless": true
          }
        }
      },
      "name": "Crawl4ai",
      "type": "n8n-nodes-crawl4ai_naf.crawl4ai",
      "typeVersion": 1,
      "position": [250, 300]
    }
  ]
}
LLM Prompt Interaction
{
  "nodes": [
    {
      "parameters": {
        "interactionType": "llm_prompt",
        "llmPromptConfig": {
          "promptSettings": {
            "promptText": "Find the login form, fill username with 'testuser' and password with 'testpass', then click the submit button",
            "provider": "openai/gpt-4",
            "maxTokens": 1000
          }
        }
      },
      "name": "Crawl4aiInteraction",
      "type": "n8n-nodes-crawl4ai_naf.crawl4aiInteraction",
      "typeVersion": 1,
      "position": [250, 300]
    }
  ]
}
Element Click Interaction
{
  "nodes": [
    {
      "parameters": {
        "interactionType": "element_click",
        "elementConfig": {
          "clickSettings": {
            "selector": "#submit-button",
            "waitAfterClick": 2000
          }
        }
      },
      "name": "Crawl4aiInteraction",
      "type": "n8n-nodes-crawl4ai_naf.crawl4aiInteraction",
      "typeVersion": 1,
      "position": [450, 300]
    }
  ]
}
Complete Workflow Example
{
  "nodes": [
    {
      "parameters": {
        "operation": "basic_crawl",
        "urlConfig": {
          "urls": [
            {
              "url": "https://example.com/login"
            }
          ]
        }
      },
      "name": "Crawl4ai",
      "type": "n8n-nodes-crawl4ai_naf.crawl4ai",
      "typeVersion": 1,
      "position": [250, 300]
    },
    {
      "parameters": {
        "interactionType": "authentication",
        "authConfig": {
          "authSettings": {
            "username": "user@example.com",
            "password": "password123",
            "loginUrl": "https://example.com/login"
          }
        }
      },
      "name": "Crawl4aiInteraction",
      "type": "n8n-nodes-crawl4ai_naf.crawl4aiInteraction",
      "typeVersion": 1,
      "position": [450, 300]
    },
    {
      "parameters": {
        "operation": "css_extraction",
        "urlConfig": {
          "urls": [
            {
              "url": "https://example.com/dashboard"
            }
          ]
        }
      },
      "name": "Crawl4ai2",
      "type": "n8n-nodes-crawl4ai_naf.crawl4ai",
      "typeVersion": 1,
      "position": [650, 300]
    }
  ],
  "connections": {
    "Crawl4ai": {
      "main": [
        [
          {
            "node": "Crawl4aiInteraction",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Crawl4aiInteraction": {
      "main": [
        [
          {
            "node": "Crawl4ai2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
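In the workflow above, each key under `connections` and each `node` value must match the `name` of an entry in `nodes`, so a typo in either place can leave the chain disconnected. The TypeScript sketch below checks that property on a trimmed-down copy of the workflow. `checkConnections` is a hypothetical helper written for this README, not an n8n or package API.

```typescript
interface WorkflowNode { name: string }
interface ConnectionTarget { node: string; type: string; index: number }
interface Workflow {
  nodes: WorkflowNode[];
  connections: Record<string, { main: ConnectionTarget[][] }>;
}

// Returns a list of problems; an empty list means every connection
// starts and ends at a node that exists in `nodes`.
function checkConnections(workflow: Workflow): string[] {
  const names = new Set(workflow.nodes.map((n) => n.name));
  const problems: string[] = [];
  for (const source of Object.keys(workflow.connections)) {
    if (!names.has(source)) problems.push(`unknown source node: ${source}`);
    for (const targets of workflow.connections[source].main) {
      for (const target of targets) {
        if (!names.has(target.node)) {
          problems.push(`unknown target node: ${target.node}`);
        }
      }
    }
  }
  return problems;
}

// Same node names and connections as the complete workflow example,
// with the parameters omitted for brevity.
const exampleWorkflow: Workflow = {
  nodes: [{ name: "Crawl4ai" }, { name: "Crawl4aiInteraction" }, { name: "Crawl4ai2" }],
  connections: {
    Crawl4ai: { main: [[{ node: "Crawl4aiInteraction", type: "main", index: 0 }]] },
    Crawl4aiInteraction: { main: [[{ node: "Crawl4ai2", type: "main", index: 0 }]] },
  },
};

console.log(checkConnections(exampleWorkflow)); // prints []
```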
Configuration
Browser Configuration
- Headless Mode: Run browser in headless mode (default: true)
- Viewport: Set browser viewport dimensions (default: 1920x1080)
- User Agent: Custom user agent string
- Proxy Support: Configure proxy settings
Anti-Detection Settings
- Undetected Mode: Enable undetected browser mode
- Stealth Mode: Enable stealth mode with fingerprint masking
- CAPTCHA Bypass: Configure CAPTCHA bypass strategies (2Captcha, Anti-Captcha, Custom)
- Behavioral Simulation: Simulate human-like interactions
Authentication Options
- Basic Auth: Username/password authentication
- Form Auth: Form-based authentication with login URL
- OAuth2: OAuth2 token-based authentication
- API Key: API key authentication
- Session Cookie: Session cookie authentication
Advanced Configuration
- Max Retries: Maximum number of retry attempts (default: 3)
- Timeout: Request timeout in milliseconds (default: 30000)
- Concurrent Requests: Number of concurrent requests (default: 5)
- Debug Mode: Enable debug logging (default: false)
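The defaults listed above can be pictured as a base object that any explicitly provided settings override. The sketch below illustrates that idea in TypeScript. `resolveAdvancedSettings` is a hypothetical helper, and the node's actual merging logic is not documented here, so treat this as an assumption about its behavior.

```typescript
// Field names follow the "advancedSettings" object used in the usage examples.
interface AdvancedSettings {
  maxRetries: number;
  timeout: number; // milliseconds
  concurrentRequests: number;
  debugMode: boolean;
}

// Default values as listed in this section.
const ADVANCED_DEFAULTS: AdvancedSettings = {
  maxRetries: 3,
  timeout: 30000,
  concurrentRequests: 5,
  debugMode: false,
};

// Hypothetical helper: later spread wins, so overrides replace defaults.
// Note that a key set explicitly to undefined would also replace its
// default, so omit keys rather than passing undefined.
function resolveAdvancedSettings(
  overrides: Partial<AdvancedSettings> = {},
): AdvancedSettings {
  return { ...ADVANCED_DEFAULTS, ...overrides };
}

const resolvedSettings = resolveAdvancedSettings({ concurrentRequests: 2, debugMode: true });
console.log(resolvedSettings);
```

With these overrides, `maxRetries` and `timeout` keep their defaults while `concurrentRequests` and `debugMode` take the supplied values, matching the authentication example earlier in this README.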
Development
Prerequisites
- Node.js 18+
- npm 9+
- TypeScript 5+
Building
npm install
npm run build
Testing
npm run test
Publishing
npm publish
Error Handling
Both nodes validate their input and report errors:
- Input data validation
- URL format validation
- Configuration parameter validation
- Authentication credential validation
- Descriptive error messages with timestamps
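As an illustration of what URL format validation with a timestamped error might look like, here is a minimal TypeScript sketch. `validateUrl` and the shape of its result are assumptions made for this example. The nodes' real validation code and error format may differ.

```typescript
type UrlCheck =
  | { ok: true; url: string }
  | { ok: false; error: string; timestamp: string };

// Hypothetical validator: accepts only absolute http(s) URLs and attaches
// an ISO 8601 timestamp to every failure.
function validateUrl(input: string): UrlCheck {
  let parsed: URL;
  try {
    parsed = new URL(input); // throws a TypeError on malformed input
  } catch (err) {
    return { ok: false, error: `invalid URL: ${input}`, timestamp: new Date().toISOString() };
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return {
      ok: false,
      error: `unsupported protocol: ${parsed.protocol}`,
      timestamp: new Date().toISOString(),
    };
  }
  return { ok: true, url: parsed.href };
}

console.log(validateUrl("https://example.com"));
console.log(validateUrl("not a url"));
```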
Support
For issues, questions, or contributions, please contact: contact@nafer.ru
License
MIT