Actions
- Deep SERPAPI Actions
- Universal Scraping API Actions
- Crawler Actions
Overview
The "Universal Scraping API" node with the "Web Unlocker" operation is designed to bypass common web scraping protections and restrictions on target websites. It enables users to retrieve the fully rendered HTML content of a webpage, including JavaScript-rendered elements, by simulating a real browser environment. This is particularly useful for scraping data from sites that use anti-bot measures such as CAPTCHAs, IP blocking, or require JavaScript execution to load content.
Common scenarios where this node is beneficial include:
- Extracting data from websites that heavily rely on client-side rendering.
- Accessing content behind geo-restrictions by specifying a country proxy.
- Avoiding detection by blocking unnecessary resource types like images or fonts.
- Automating data collection from sites protected by anti-scraping technologies.
Practical example:
- A user wants to scrape product details from an e-commerce site that loads prices dynamically via JavaScript and blocks requests from non-browser clients. Using this node with JS rendering enabled and headless browsing, the user can obtain the complete page content as seen in a real browser.
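For orientation, the sketch below shows the kind of request the node issues on your behalf. The endpoint URL, header name, and body field names are illustrative assumptions, not documented Scrapeless API details; in the node itself you only fill in the properties listed in the next section.

```typescript
// Hypothetical sketch of a Web Unlocker call. The endpoint URL, auth
// header, and field names are assumptions for illustration only.
async function unlock(apiKey: string, targetUrl: string): Promise<string> {
  const response = await fetch('https://api.scrapeless.com/unlocker/request', {
    method: 'POST',
    headers: {
      'x-api-token': apiKey, // assumed auth header
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      url: targetUrl,  // Target URL property
      js_render: true, // Js Render: execute JavaScript before returning HTML
      headless: true,  // Headless: no visible browser UI
      country: 'ANY',  // Country: "World Wide", i.e. no geo restriction
    }),
  });
  if (!response.ok) {
    throw new Error(`Web Unlocker request failed with status ${response.status}`);
  }
  return response.text(); // fully rendered HTML, as a real browser would see it
}
```

With JS rendering and headless mode both enabled, the returned HTML includes the dynamically loaded prices from the e-commerce example above.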
Properties
| Name | Meaning |
|---|---|
| Target URL | The URL of the webpage to unlock and scrape. Must be a valid HTTP/HTTPS address. |
| Js Render | Whether to enable JavaScript rendering on the page. When true, the node will execute JavaScript to render dynamic content before returning the result. |
| Headless | Whether to run the browser in headless mode (without a visible UI). Typically set to true for automated scraping tasks. |
| Country | The geographic location from which the request should appear to originate. Useful for bypassing geo-blocks or accessing region-specific content. Options include many countries worldwide and "World Wide" (ANY) for no restriction. |
| Js Instructions | JSON array of instructions for controlling the JavaScript rendering process, such as waiting times or custom scripts to execute during page load. Default is to wait 100 milliseconds. |
| Block | JSON object specifying resources and URLs to block during page loading to speed up scraping and reduce bandwidth. For example, blocking images, fonts, and scripts from specified URLs. |
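To make the two JSON-valued properties concrete, here are plausible values; the field names are assumptions inferred from the descriptions above, so check them against the service's current documentation before relying on them.

```typescript
// Illustrative values for the Js Instructions and Block properties.
// Field names are assumptions based on the descriptions above.
const jsInstructions = [
  { wait: 500 }, // wait 500 ms for dynamic content (the default is 100 ms)
];

const block = {
  resources: ['image', 'font', 'stylesheet'], // resource types to skip loading
  urls: ['https://ads.example.com'],          // hypothetical URL to block
};
```

Blocking images and fonts usually reduces page weight and load time considerably without affecting the HTML you want to extract.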
Output
The node outputs a JSON object containing the scraped webpage data after unlocking it. The exact structure depends on the response from the Universal Scraping API but typically includes:
- The full HTML content of the unlocked page.
- Metadata about the request or response.
- Any extracted data if applicable.
If the API returns binary data (e.g., screenshots or files), it is included in the node's binary output field as the raw data fetched from the target URL.
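Downstream nodes consume this output like any other n8n item. As a sketch, a Code node placed after this one could pull a value out of the rendered page; the `html` field name here is an assumption about the response shape:

```typescript
// n8n Code node ("Run Once for All Items") following the Universal
// Scraping API node. The `html` field name is an assumed output key.
return $input.all().map((item) => ({
  json: {
    // Naive extraction: grab the page <title> from the rendered HTML.
    title: String(item.json.html ?? '')
      .match(/<title>([^<]*)<\/title>/i)?.[1] ?? null,
  },
}));
```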
Dependencies
- Requires an API key credential for the Scrapeless service to authenticate requests.
- Depends on the Universal Scraping API service provided by Scrapeless.
- Network access to the target URLs and possibly proxy servers depending on the selected country option.
- The n8n environment must have internet connectivity and be properly configured to call external APIs.
Troubleshooting
Common issues:
- Invalid or missing API credentials will cause authentication failures.
- Incorrect or malformed Target URL may lead to request errors.
- Selecting a country with restricted access or unavailable proxies might result in blocked requests.
- Improperly formatted JSON in Js Instructions or Block properties can cause parsing errors.
Error messages:
"Unsupported resource": Occurs if the Resource parameter is not set to "universalScrapingApi".- API errors related to rate limits or invalid parameters will be returned from the Scrapeless service; check your API usage and input values.
- Timeout or network errors may happen if the target website is unreachable or slow to respond.
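For transient timeout or network errors, retrying with backoff before failing the workflow is a common mitigation; n8n's per-node "Retry On Fail" setting covers the simple case, but a minimal generic sketch in code looks like this:

```typescript
// Generic retry-with-backoff wrapper for calls that may hit transient
// timeouts or network errors (e.g., a Web Unlocker request).
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts: 1 s, 2 s, 4 s.
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** i));
    }
  }
  throw lastError;
}
```

Usage would be to wrap whatever function issues the request, e.g. `await withRetry(() => unlock(apiKey, url))` with the hypothetical helper sketched in the Overview.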
Resolutions:
- Verify API key validity and permissions.
- Ensure URLs are correct and accessible.
- Adjust Js Instructions to allow sufficient time for page rendering.
- Use the Block property to disable loading heavy resources that may slow down or block scraping.