headlessx

n8n community node for HeadlessX v2 API - anti-detection web scraping with Camoufox

Package Information

Downloads: 23 weekly / 372 monthly
Latest Version: 2.0.1
Author: SaifyXPRO

Documentation

n8n-nodes-headlessx


🚀 n8n community node for HeadlessX v2 - Anti-detection web scraping with Camoufox




🚀 About HeadlessX v2

HeadlessX v2 is a stealth web scraping API powered by Camoufox, a Firefox-based browser built to avoid detection by anti-bot systems.

🎯 Core Capabilities

| Feature | Description | Use Cases |
| --- | --- | --- |
| 🦊 Camoufox Engine | Undetectable Firefox-based browser | Bot detection bypass |
| 🔍 Google SERP | Extract search results with anti-detection | SEO monitoring, search analysis |
| 📄 HTML Extraction | Fast raw HTML or JS-rendered content | Web scraping, data mining |
| 📝 Content Extraction | Clean Markdown from any page | Content analysis, text processing |
| 📸 Screenshots | High-quality page captures | Visual testing, documentation |

⚠️ Important: HeadlessX runs as a separate API server. This n8n node is a client that connects to your HeadlessX instance.

🔗 Get HeadlessX: github.com/SaifyXPRO/HeadlessX

✨ What's New in v2.0

🚨 Major Version Update - Breaking Changes

| Change | Before (v1.x) | After (v2.0) |
| --- | --- | --- |
| API Paths | /api/html | /api/website/html |
| Operations | 8 operations | 5 streamlined operations |
| Methods | GET + POST duplicates | POST only (simplified) |
| New Features | - | Google SERP, HTML-JS rendering |
| Removed | PDF, Batch, Render | Not in v2 API |

🔧 v2.0 Operations

| Operation | Endpoint | Description |
| --- | --- | --- |
| 📄 Extract HTML | POST /api/website/html | Fast raw HTML extraction |
| 📄 Extract HTML (JS) | POST /api/website/html-js | HTML with JavaScript rendering |
| 📝 Extract Content | POST /api/website/content | Clean Markdown content |
| 📸 Screenshot | POST /api/website/screenshot | High-quality page captures |
| 🔍 Google SERP | POST /api/google-serp/search | Google search results extraction |
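
The operation-to-endpoint mapping above can be captured in a small lookup. This is a minimal sketch, not the node's actual source: the `content` key and the helper name `endpointUrl` are assumptions, while the other keys follow the operation names used elsewhere in this README (`html`, `htmlJs`, `screenshot`, `googleSerp`).

```typescript
// Operation keys: "content" is an assumed name; the rest appear in this README.
type Operation = "html" | "htmlJs" | "content" | "screenshot" | "googleSerp";

// Paths copied from the v2.0 operations table. All of them are POST endpoints.
const ENDPOINTS: Record<Operation, string> = {
  html: "/api/website/html",
  htmlJs: "/api/website/html-js",
  content: "/api/website/content",
  screenshot: "/api/website/screenshot",
  googleSerp: "/api/google-serp/search",
};

// Hypothetical helper: joins the configured Base URL with an operation path.
function endpointUrl(baseUrl: string, operation: Operation): string {
  // Trim trailing slashes so the join never produces "//".
  return baseUrl.replace(/\/+$/, "") + ENDPOINTS[operation];
}
```

Keeping the paths in one table like this makes a v1.x to v2.0 style path change a one-line edit.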

🚀 Quick Start

📋 Prerequisites

| Requirement | Version | Installation |
| --- | --- | --- |
| HeadlessX Server | v2.0+ | Install Guide |
| n8n | 1.0.0+ | n8n Documentation |
| Node.js | 18+ | nodejs.org |

⚡ 30-Second Setup

  1. Install HeadlessX Server:

    git clone https://github.com/SaifyXPRO/HeadlessX.git
    cd HeadlessX && pnpm install && pnpm dev

  2. Install n8n Community Node:

    • Go to Settings → Community Nodes in n8n
    • Enter: n8n-nodes-headlessx
    • Click Install

  3. Configure Credentials:

    • Create new HeadlessX API credential
    • Base URL: http://localhost:3000
    • API Token: Your token

  4. Test Connection:

    • Add HeadlessX node to workflow
    • Select any operation and test

📦 Installation

🎯 Installation Options

📱 Option 1: n8n Community Nodes (Recommended)

  1. Navigate to Settings → Community Nodes in your n8n instance
  2. Click Install a community node
  3. Enter package name: n8n-nodes-headlessx
  4. Click Install and wait for completion
  5. Restart n8n if required

📦 Option 2: npm Installation

# Global installation
npm install -g n8n-nodes-headlessx

# Local installation (for self-hosted n8n)
npm install n8n-nodes-headlessx

🐳 Option 3: Docker Setup

FROM n8nio/n8n:latest
USER root
RUN npm install -g n8n-nodes-headlessx
USER node

Docker Compose Example:

version: '3.8'
services:
  headlessx:
    build: ./HeadlessX
    ports: ["3000:3000"]
    environment:
      - DATABASE_URL=postgresql://...
    restart: unless-stopped

  n8n:
    image: n8nio/n8n:latest
    ports: ["5678:5678"]
    volumes: ["n8n_data:/home/node/.n8n"]
    depends_on: [headlessx]
    restart: unless-stopped

volumes:
  n8n_data:

🔧 Configuration

🔐 Setting Up Credentials

| Field | Description | Example | Required |
| --- | --- | --- | --- |
| Base URL | HeadlessX server endpoint | http://localhost:3000 | |
| API Token | Authentication token | your-secret-token | |

🔒 Authentication Methods

| Method | Format | Auto-Applied |
| --- | --- | --- |
| Query Parameter | ?token=your-token | |
| Header Authentication | X-Token: your-token | |
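
For readers calling the API outside n8n, the two authentication formats above might be applied as follows. This is a sketch under assumptions: `withAuth` is a hypothetical helper, and the JSON `Content-Type` header assumes the POST endpoints accept JSON bodies, which this README does not state explicitly. Only the `X-Token` header name and the `token` query parameter come from the table.

```typescript
type AuthMode = "header" | "query";

interface AuthedRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: attaches the API token using one of the two documented formats.
function withAuth(url: string, token: string, mode: AuthMode = "header"): AuthedRequest {
  // Assumption: request bodies are JSON.
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (mode === "header") {
    headers["X-Token"] = token; // "X-Token: your-token"
    return { url, headers };
  }
  // "?token=your-token", appended with "&" if the URL already has a query string.
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return { url: url + separator + "token=" + encodeURIComponent(token), headers };
}
```

Header authentication keeps the token out of URLs, which tend to end up in server and proxy logs, so it is usually the safer default.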

🎯 Available Operations

📊 v2.0 Operations

📄 Extract HTML

Endpoint: POST /api/website/html

Extract raw HTML content from any webpage quickly without JavaScript rendering.

Parameters:

| Option | Description | Default |
| --- | --- | --- |
| URL | Target webpage URL | Required |
| Timeout | Request timeout (ms) | 30000 |
| Wait Until | Page load condition | load |
| Headers | Custom HTTP headers | - |
| User Agent | Custom user agent | - |

Use Cases:

  • Simple page scraping
  • Static content extraction
  • Fast bulk operations

📄 Extract HTML (JS Rendered)

Endpoint: POST /api/website/html-js

Extract HTML with full JavaScript rendering for SPAs and dynamic content.

Parameters:

| Option | Description | Default |
| --- | --- | --- |
| URL | Target webpage URL | Required |
| Timeout | Request timeout (ms) | 30000 |
| Extra Wait | Additional wait time after load | 0 |
| Wait Until | Page load condition | networkidle0 |

Use Cases:

  • Single Page Applications (SPAs)
  • React/Vue/Angular sites
  • Dynamic content extraction

📝 Extract Content

Endpoint: POST /api/website/content

Extract clean, readable Markdown content from any webpage.

Parameters:

| Option | Description | Default |
| --- | --- | --- |
| URL | Target webpage URL | Required |
| Timeout | Request timeout (ms) | 30000 |
| Wait Until | Page load condition | load |

Use Cases:

  • Article extraction
  • Content analysis
  • Text processing
  • AI/LLM data preparation

📸 Take Screenshot

Endpoint: POST /api/website/screenshot

Capture high-quality screenshots of webpages.

Parameters:

| Option | Description | Default |
| --- | --- | --- |
| URL | Target webpage URL | Required |
| Full Page | Capture entire page | true |
| Format | PNG, JPEG, WebP | png |
| Quality | Image quality (1-100) | 80 |
| Wait for Selector | CSS selector to wait for | - |

Use Cases:

  • Visual regression testing
  • Website monitoring
  • Documentation
  • Social media content

🔍 Google SERP Search

Endpoint: POST /api/google-serp/search

Extract Google search results with advanced anti-detection.

Parameters:

| Option | Description | Default |
| --- | --- | --- |
| Query | Search query | Required |
| Number of Results | Results to return | 10 |
| Language | Search language | en |
| Country | Result localization | us |
| Safe Search | Safety filter level | off |

Use Cases:

  • SEO monitoring
  • Competitor analysis
  • Keyword research
  • Search result tracking
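
The defaults in the parameter table can be applied with a small builder. This sketch uses the `num`, `hl` and `gl` keys that appear under `serpOptions` in this README's Google SERP Monitoring example. The input field names and the function name are hypothetical, and Safe Search is left out because its option key is not shown in this document.

```typescript
// Hypothetical input shape mirroring the parameter table.
interface SerpParams {
  numberOfResults?: number;
  language?: string;
  country?: string;
}

// Keys as used under "serpOptions" in this README's workflow example.
interface SerpOptions {
  num: number;
  hl: string;
  gl: string;
}

// Fills in the documented defaults (10 results, "en", "us") for anything omitted.
function buildSerpOptions(params: SerpParams = {}): SerpOptions {
  return {
    num: params.numberOfResults ?? 10,
    hl: params.language ?? "en",
    gl: params.country ?? "us",
  };
}
```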

💡 Example Workflows

🚀 Quick Start Examples

1. 🕷️ Simple Web Scraping

graph LR
    A[Manual Trigger] --> B[HeadlessX: Extract HTML]
    B --> C[Code Node: Process HTML]
    C --> D[Output Results]

Configuration:

{
  "operation": "html",
  "url": "https://example.com",
  "htmlOptions": {
    "timeout": 30000,
    "waitUntil": "networkidle2"
  }
}
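
The "Process HTML" step in this workflow could look something like the function below. It is only a sketch: the function name and result shape are hypothetical, it takes the HTML as a plain string because the node's output format is not documented here, and regular expressions are a rough shortcut rather than a full HTML parser.

```typescript
interface PageSummary {
  title: string | null;
  links: string[];
}

// Pulls the <title> text and all quoted <a href> values out of an HTML string.
function summarizeHtml(html: string): PageSummary {
  const titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  const links: string[] = [];
  // Matches href attributes in single or double quotes; unquoted hrefs are skipped.
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    links.push(match[1]);
  }
  return { title: titleMatch ? titleMatch[1].trim() : null, links };
}
```

For anything beyond quick extraction, a real HTML parser is the more reliable choice.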

2. 🔍 Google SERP Monitoring

graph LR
    A[Schedule Trigger] --> B[HeadlessX: Google SERP]
    B --> C[Store Results]
    C --> D[Alert on Changes]

Configuration:

{
  "operation": "googleSerp",
  "query": "your keyword",
  "serpOptions": {
    "num": 20,
    "hl": "en",
    "gl": "us"
  }
}
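
For the "Alert on Changes" step, one option is to compare the ranked URLs from two runs. The sketch below assumes each run has already been reduced to an array of result URLs in rank order. The HeadlessX response format is not described in this README, so that reduction step is left out and all names here are hypothetical.

```typescript
interface RankMove {
  url: string;
  from: number;
  to: number;
}

interface SerpDiff {
  added: string[];
  removed: string[];
  moved: RankMove[];
}

// Compares two ranked URL lists (index 0 = position 1) and reports what changed.
function diffSerp(previous: string[], current: string[]): SerpDiff {
  const previousRank = new Map<string, number>();
  previous.forEach((url, i) => previousRank.set(url, i + 1));
  const currentUrls = new Set(current);

  const added: string[] = [];
  const moved: RankMove[] = [];
  current.forEach((url, i) => {
    const before = previousRank.get(url);
    if (before === undefined) {
      added.push(url); // new in this run
    } else if (before !== i + 1) {
      moved.push({ url, from: before, to: i + 1 }); // same URL, different position
    }
  });
  const removed = previous.filter((url) => !currentUrls.has(url));
  return { added, removed, moved };
}
```

An alert could then fire whenever any of the three lists is non-empty.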

3. 📸 Website Monitoring

graph LR
    A[Schedule Trigger] --> B[HeadlessX: Screenshot]
    B --> C[Compare Images]
    C --> D[Send Alert]

Configuration:

{
  "operation": "screenshot",
  "url": "https://your-website.com",
  "screenshotOptions": {
    "fullPage": true,
    "format": "png"
  }
}
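
The "Compare Images" step can be as simple as checking whether the screenshot bytes changed since the last run. The sketch below is one possible approach, not part of the node: it assumes the screenshot is available as a `Uint8Array` and uses a small FNV-1a fingerprint so that only a short string needs to be stored between runs. The function names are hypothetical.

```typescript
// 32-bit FNV-1a hash of the raw bytes, returned as 8 hex characters.
function fingerprint(bytes: Uint8Array): string {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// True only when a previous fingerprint exists and differs from the current bytes.
function hasChanged(previousFingerprint: string | null, current: Uint8Array): boolean {
  return previousFingerprint !== null && previousFingerprint !== fingerprint(current);
}
```

A byte-level check reports any difference at all, including ads, timestamps and other dynamic content, and a 32-bit hash can collide. A cryptographic hash or a perceptual image diff would be sturdier if false alerts or missed changes matter.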

🚨 Troubleshooting

🔍 Common Issues & Solutions

❌ Connection Issues

"Couldn't connect with these settings"

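| Check | Solution |
| --- | --- |
| Server Running | curl http://localhost:3000/api/health |
| URL Format | Use http://localhost:3000 (no /api) |
| Network Access | Check firewall/Docker networking. When n8n runs in Docker Compose alongside HeadlessX, use the service name (for example http://headlessx:3000) instead of localhost |
| Token Validity | Verify API token is correct |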
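
The URL format check can be automated before the credential is saved. This is a sketch with hypothetical helper names. It strips a trailing `/api` and trailing slashes from the Base URL, then derives the health endpoint used in the curl check above.

```typescript
// Normalizes a user-entered Base URL: no trailing slashes, no trailing "/api".
function normalizeBaseUrl(raw: string): string {
  return raw.trim().replace(/\/+$/, "").replace(/\/api$/i, "");
}

// Builds the health-check URL (/api/health) from a possibly messy Base URL.
function healthUrl(rawBaseUrl: string): string {
  return normalizeBaseUrl(rawBaseUrl) + "/api/health";
}
```

For example, `http://localhost:3000/api/` and `http://localhost:3000` both resolve to the same health URL.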

⏱️ Timeout Issues

"Request timeout" errors

| Cause | Solution |
| --- | --- |
| Slow Page Load | Increase timeout to 60000 ms or more |
| Dynamic Content | Use htmlJs operation with extraWait |
| Heavy Resources | Use domcontentloaded wait condition |
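
When calling the API from custom code, the "increase timeout" advice can be applied gradually instead of always waiting the maximum. The sketch below is an illustration with hypothetical names. `run` stands in for whatever function performs the request with a given timeout.

```typescript
// Doubles the timeout on each attempt, capped at capMs.
function timeoutSchedule(baseMs: number, attempts: number, capMs: number): number[] {
  const schedule: number[] = [];
  for (let i = 0; i < attempts; i++) {
    schedule.push(Math.min(baseMs * Math.pow(2, i), capMs));
  }
  return schedule;
}

// Tries run() once per timeout in the schedule and returns the first success.
// Note: this retries on any error, not only on timeouts.
async function withEscalatingTimeout<T>(
  run: (timeoutMs: number) => Promise<T>,
  schedule: number[],
): Promise<T> {
  let lastError: unknown = new Error("empty timeout schedule");
  for (const timeoutMs of schedule) {
    try {
      return await run(timeoutMs);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next timeout
    }
  }
  throw lastError;
}
```

Starting from the documented 30000 ms default, `timeoutSchedule(30000, 3, 120000)` yields 30000, 60000 and 120000 ms.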

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

📄 License

MIT License - see LICENSE for details.


Made with ❤️ by SaifyXPRO
