Package Information
Downloads: 20 weekly / 202 monthly
Latest Version: 1.0.1
Author: Scrapix
n8n-nodes-scrapix
This is an n8n community node that integrates the Scrapix web scraping API into n8n workflows.
n8n is a fair-code licensed workflow automation platform.
Scrapix is a web scraping API that provides scraping, crawling, and AI-powered data extraction.
Features
This node provides four main operations:
🔍 Scrape
Scrape a single URL and return its content in multiple formats:
- HTML, Markdown, Text
- DOCX, PDF, Base64
- Optional structured data extraction
- Optional content summarization
📋 Collect
Discover and collect URLs from a page:
- Regex-based path filtering (include/exclude)
- Sitemap URL extraction
- Configurable URL limits
- Structured output (JSON, XML, YAML, TOML)
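The include/exclude filtering and URL limit described above can be illustrated with a small sketch. This is not the Scrapix implementation, only one plausible reading of the behavior: the function name `filterUrls` and the option names `include`, `exclude`, and `limit` are hypothetical, and it assumes patterns are matched against the URL path only.

```typescript
// Hypothetical option names; the real node's parameter names may differ.
interface CollectFilterOptions {
  include?: string[]; // regex patterns; if any are given, a path must match at least one
  exclude?: string[]; // regex patterns; a path matching any of these is dropped
  limit?: number;     // maximum number of URLs to keep
}

// Illustrative helper (not part of Scrapix): filter discovered URLs by path.
function filterUrls(urls: string[], options: CollectFilterOptions = {}): string[] {
  const include = (options.include ?? []).map((p) => new RegExp(p));
  const exclude = (options.exclude ?? []).map((p) => new RegExp(p));
  const limit = options.limit ?? Infinity;

  const kept: string[] = [];
  for (const raw of urls) {
    if (kept.length >= limit) break; // stop once the limit is reached

    let path: string;
    try {
      path = new URL(raw).pathname; // assumption: match on the path only
    } catch {
      continue; // skip strings that are not absolute URLs
    }

    if (include.length > 0 && !include.some((re) => re.test(path))) continue;
    if (exclude.some((re) => re.test(path))) continue;
    kept.push(raw);
  }
  return kept;
}

// Example: keep blog posts, drop drafts, return at most two URLs.
const found = [
  "https://example.com/blog/post-1",
  "https://example.com/blog/drafts/post-2",
  "https://example.com/about",
  "https://example.com/blog/post-3",
];
const result = filterUrls(found, {
  include: ["^/blog/"],
  exclude: ["/drafts/"],
  limit: 2,
});
```

In this sketch the exclude patterns take precedence over the include patterns, and the limit applies to the filtered list rather than to the raw list of discovered URLs. Whether Scrapix makes the same choices is not stated here, so treat the ordering as an assumption.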
🕷️ Crawl
Crawl multiple URLs from a starting page:
- All features from Collect
- Returns content from each discovered URL
- Multiple output formats supported
🤖 Extract
AI-powered extraction of structured data:
- Natural language queries
- Structured schema support
- Content summarization
- JSON, XML, YAML, TOML output
Installation
Manual Installation
For self-hosted n8n instances:
npm install n8n-nodes-scrapix