scrapix

n8n node for the Scrapix web scraping API, with support for scraping, collecting, crawling, and AI-powered extraction

Package Information

Downloads: 20 weekly / 202 monthly
Latest Version: 1.0.1
Author: Scrapix

Documentation

n8n-nodes-scrapix

This is an n8n community node that integrates the Scrapix web scraping API into n8n workflows.

n8n is a fair-code licensed workflow automation platform.

Scrapix is a web scraping API that provides scraping, crawling, and AI-powered data extraction.

Features

This node provides four main operations:

🔍 Scrape

Scrape a single URL and return its content in one of several formats, with optional post-processing:

  • HTML, Markdown, Text
  • DOCX, PDF, Base64
  • Optional structured data extraction
  • Optional content summarization

📋 Collect

Discover and collect URLs from a page:

  • Regex-based path filtering (include/exclude)
  • Sitemap URL extraction
  • Configurable URL limits
  • Structured output (JSON, XML, YAML, TOML)
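
The include/exclude filtering and URL limit described above can be illustrated with a minimal sketch. This is not the node's or the Scrapix API's implementation; the names `CollectOptions`, `include`, `exclude`, `limit`, and `filterUrls` are assumptions made for illustration, and the real parameter names may differ.

```typescript
// Hypothetical option names, chosen for illustration only.
interface CollectOptions {
  include?: string[]; // regex patterns; a URL path must match at least one
  exclude?: string[]; // regex patterns; a URL path matching any is dropped
  limit?: number;     // maximum number of URLs to keep
}

// Filter a list of discovered URLs by path, deduplicate, and cap the result.
function filterUrls(urls: string[], opts: CollectOptions = {}): string[] {
  const include = (opts.include ?? []).map((p) => new RegExp(p));
  const exclude = (opts.exclude ?? []).map((p) => new RegExp(p));
  const limit = opts.limit ?? Infinity;
  const seen = new Set<string>();
  const kept: string[] = [];

  for (const raw of urls) {
    if (kept.length >= limit) break;

    let path: string;
    try {
      path = new URL(raw).pathname;
    } catch {
      continue; // skip strings that are not absolute URLs
    }

    if (seen.has(raw)) continue; // skip exact duplicates
    seen.add(raw);

    // An empty include list means "accept everything".
    if (include.length > 0 && !include.some((re) => re.test(path))) continue;
    if (exclude.some((re) => re.test(path))) continue;

    kept.push(raw);
  }
  return kept;
}

const sample = [
  "https://example.com/blog/a",
  "https://example.com/blog/draft/b",
  "https://example.com/about",
  "https://example.com/blog/c",
];
console.log(filterUrls(sample, { include: ["^/blog/"], exclude: ["/draft/"], limit: 2 }));
```

In this sketch, exclusion takes precedence over inclusion, and the limit applies after filtering. Whether Scrapix applies them in the same order is not stated in the source.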

🕷️ Crawl

Crawl multiple URLs from a starting page:

  • All features from Collect
  • Returns content from each discovered URL
  • Multiple output formats supported

🤖 Extract

AI-powered extraction of structured data:

  • Natural language queries
  • Structured schema support
  • Content summarization
  • JSON, XML, YAML, TOML output

Installation

Manual Installation

For self-hosted n8n instances:

npm install n8n-nodes-scrapix
