Overview
The Scrappey node enables advanced web scraping and HTTP requests with built-in anti-bot protection bypass capabilities. It is designed to handle complex scenarios where websites employ CAPTCHA, Cloudflare, or other anti-bot measures that typically block automated requests. The node supports three main operation modes:
- Request Builder: Create customized HTTP or browser requests with detailed configuration options such as headers, cookies, proxies, and request types.
- HTTP Request • Auto-Retry on Protection: Automatically retries HTTP requests when blocked by anti-bot protections, resending the same payload and settings.
- Browser Request • Auto-Retry & Anti-Bot: Executes browser-based requests with anti-bot techniques like mouse movement emulation and challenge solving, retrying automatically if protection pages are encountered.
This node is beneficial for users needing reliable data extraction from protected websites, automating interactions with web services that have strict bot detection, or integrating resilient web requests into workflows.
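The auto-retry behavior described above (resend the same payload and settings when a request is blocked) can be sketched roughly as follows. This is a generic illustration, not the node's actual implementation: `looks_blocked`, `request_with_retry`, the `send` callback, and the status codes treated as "blocked" are all assumptions made for the example.

```python
from typing import Any, Callable, Dict


def looks_blocked(response: Dict[str, Any]) -> bool:
    """Hypothetical check: treat these status codes as an anti-bot block."""
    return response.get("status") in {403, 429, 503}


def request_with_retry(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    payload: Dict[str, Any],
    attempts: int = 3,
) -> Dict[str, Any]:
    """Call send(payload) up to `attempts` times, stopping at the first
    response that does not look blocked."""
    if not 1 <= attempts <= 3:
        raise ValueError("attempts must be between 1 and 3")
    response: Dict[str, Any] = {}
    for _ in range(attempts):
        # Copy the payload so every retry sends identical settings.
        response = send(dict(payload))
        if not looks_blocked(response):
            return response
    # Every attempt was blocked: return the last response for inspection.
    return response


# Demo with a fake sender that is blocked once, then succeeds.
calls = []


def fake_send(payload: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(payload)
    if len(calls) < 2:
        return {"status": 403}
    return {"status": 200, "body": "ok"}


result = request_with_retry(fake_send, {"url": "https://example.com", "method": "GET"})
```

The sender is passed in as a callback so the retry logic can be exercised without any network access.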
Practical Examples
- Scraping product details from e-commerce sites protected by Cloudflare.
- Automating form submissions on websites with hCaptcha or reCAPTCHA challenges.
- Fetching API data behind anti-bot layers using browser emulation and proxy rotation.
Properties
| Name | Meaning |
|---|---|
| Scrappey Operations | Mode of operation: Request Builder (build custom HTTP or browser requests); HTTP Request • Auto-Retry on Protection (retry HTTP requests blocked by anti-bot protection); Browser Request • Auto-Retry & Anti-Bot (browser-based requests with anti-bot techniques). |
| URL | The target page URL to scrape (required for Request Builder). |
| HTTP Method | HTTP method to use (GET, POST, PUT, DELETE, PATCH, PUBLISH) for Request Builder. |
| Request Type | Type of request in Request Builder: Browser; Request (standard HTTP); Patched Chrome Browser. |
| Which Proxy To Use | Proxy source: Proxy From Credentials; Proxy From HTTP Request Node; Proxy From Scrappey. |
| Proxy Type | Type of proxy when using a Scrappey proxy: Residential proxy; Premium residential proxy; Datacenter proxy; Mobile proxy. |
| Custom proxy country | Enable to specify a proxy country. |
| Custom Proxy Country | Select the country for the proxy to use (if enabled). |
| Custom proxy | When enabled, uses the proxy defined in credentials for this request (only for Request Builder with default proxy type). |
| Body OR Params? | For methods supporting body or params (POST, PUT, PATCH, DELETE, PUBLISH), select whether to send data in the body or as URL parameters. |
| Params | Parameters to include in the request URL (when "Params" selected). |
| Body | Request body content (when "Body" selected). |
| User Session | Identifier for user session to use in the request (optional). |
| Headers Input Method | How to input headers: Using Fields Below; Using JSON. |
| Custom Headers | Define custom headers as key-value pairs (when using fields). |
| JSON Headers | Define custom headers as a JSON object string (when using JSON input). |
| One String Cookie | Use a single string format for cookies instead of key-value pairs. |
| Single String Cookie | Cookie string in "name=value; name2=value2" format (if one string cookie enabled). |
| Custom Cookies | Define cookies as key-value pairs (if one string cookie disabled). |
| Datadome | Enable bypass for Datadome protection (only for Browser request type). |
| Attempts | Number of attempts to make the request if it fails (1 to 3). |
| Antibot | Enable automatic solving of hCaptcha and reCAPTCHA challenges (only for Browser request type). |
| Add Random mouse movement | Simulate human interaction by adding random mouse movements during the browser session (only for Browser request type). |
| Record Video Session | Record a video of the browser session for debugging purposes (only for Browser request type). |
| CSS Selector | CSS selector to target specific elements on the page (only for Browser request type). |
| Href (Optional) | URL to navigate to when the CSS selector is used (only for Browser request type). |
| Intercept XHR/Fetch Request | Intercept and return data from a specific XHR/Fetch request URL instead of the main page content (only for Browser request type). |
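
The JSON Headers and Single String Cookie properties both expect a specific string format. A minimal sketch of producing those strings from ordinary dictionaries is shown here; the helper names (`to_json_headers`, `to_cookie_string`, `parse_cookie_string`) are hypothetical and not part of the node.

```python
import json
from typing import Dict


def to_json_headers(headers: Dict[str, str]) -> str:
    """Serialize headers as a JSON object string for the JSON Headers field."""
    return json.dumps(headers)


def to_cookie_string(cookies: Dict[str, str]) -> str:
    """Join cookies in the "name=value; name2=value2" format."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def parse_cookie_string(cookie_string: str) -> Dict[str, str]:
    """Split a single cookie string back into key-value pairs."""
    cookies: Dict[str, str] = {}
    for part in cookie_string.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, separator, value = part.partition("=")
        if not separator:
            raise ValueError(f"cookie entry without '=': {part!r}")
        cookies[name.strip()] = value.strip()
    return cookies


headers_field = to_json_headers({"Accept": "application/json"})
cookie_field = to_cookie_string({"name": "value", "name2": "value2"})
```

Parsing the string back into pairs is a quick way to check a cookie value before pasting it into the node.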
Output
The node outputs an array of items, each containing a json field with the response data from the Scrappey API. The structure of the JSON depends on the operation mode and the response from the target website or API.
- For Request Builder, the output includes the HTTP response content, parsed as JSON if possible.
- For Auto-Retry HTTP or Browser modes, the output contains the final successful response after retries and anti-bot bypasses.
Binary data such as screenshots or recorded videos, if the node returns any, would appear in the binary output fields; this behavior is not documented in detail.
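To illustrate the output shape described above (an array of items, each with a `json` field, with response content parsed as JSON when possible), here is a small sketch of how downstream code might read it. The `data` key inside `json` and the sample items are assumptions for the example; the real field names depend on the Scrappey response.

```python
import json
from typing import Any, Dict, List


def parse_if_json(content: Any) -> Any:
    """Return parsed JSON when content is a valid JSON string, else return it unchanged."""
    if not isinstance(content, str):
        return content
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return content


def collect_responses(items: List[Dict[str, Any]], key: str = "data") -> List[Any]:
    """Read the hypothetical `key` from each item's json field."""
    results = []
    for item in items:
        payload = item.get("json", {})
        results.append(parse_if_json(payload.get(key)))
    return results


# Sample items shaped like the node output (contents are made up).
sample_items = [
    {"json": {"data": '{"price": 9.99}'}},
    {"json": {"data": "<html><body>Hello</body></html>"}},
]
responses = collect_responses(sample_items)
```

Falling back to the raw string keeps HTML responses usable while still giving structured access to JSON ones.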
Dependencies
- Requires an active Scrappey API key credential configured in n8n.
- Uses the Scrappey API endpoint at https://api.scrappey.com.
- Optionally requires proxy credentials or proxy configurations depending on proxy usage.
- For browser-based operations, dependencies include headless browser environments managed by Scrappey.
- Optional features like CAPTCHA solving require appropriate service access via Scrappey.
Troubleshooting
Common Issues:
- Requests blocked due to missing or incorrect API key credential.
- Proxy misconfiguration leading to connection failures.
- Exceeding allowed number of attempts or rate limits from Scrappey API.
- Incorrectly formatted headers, cookies, or body causing request errors.
- Browser session failures if required browser environment is unavailable.
Error Messages:
- Authentication errors: Verify API key credential is correctly set.
- Proxy errors: Check proxy settings and ensure proxies are valid and reachable.
- CAPTCHA or anti-bot failures: Enable the Antibot option (Browser request type) or use one of the auto-retry operations.
- Invalid input errors: Ensure all required fields like URL and HTTP method are properly filled.
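Several of the issues above can be caught before a request is sent. The following is a rough pre-flight check, assuming the method list and the 1 to 3 attempts range given in the Properties table; `validate_request` is a hypothetical helper, not something the node exposes.

```python
from typing import List
from urllib.parse import urlparse

# Methods listed for the HTTP Method property.
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH", "PUBLISH"}


def validate_request(url: str, method: str, attempts: int = 1) -> List[str]:
    """Return a list of human-readable problems; an empty list means the input looks valid."""
    problems: List[str] = []
    parsed = urlparse(url or "")
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        problems.append("URL must be an absolute http(s) URL")
    if (method or "").upper() not in ALLOWED_METHODS:
        problems.append(f"unsupported HTTP method: {method!r}")
    if not 1 <= attempts <= 3:
        problems.append("attempts must be between 1 and 3")
    return problems


ok_problems = validate_request("https://example.com/products", "GET", 2)
bad_problems = validate_request("", "FETCH", 5)
```

Returning every problem at once, rather than raising on the first, makes it easier to fix a misconfigured node in one pass.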