Puppeteer

Request a webpage using Puppeteer

Overview

The Get Page Content operation of this custom n8n node uses Puppeteer (with optional stealth mode) to load a web page and return its rendered HTML. It lets you set advanced browser options, emulate devices, send extra headers, route traffic through a proxy, and control navigation timing. This is particularly useful for scraping dynamic websites that require JavaScript execution or that employ bot detection.

Common scenarios:

  • Scraping content from pages that require JavaScript rendering.
  • Extracting data from sites protected by anti-bot measures.
  • Automating website testing or monitoring changes in rendered HTML.

Practical examples:

  • Fetching product details from an e-commerce site that loads data dynamically.
  • Capturing the fully rendered HTML of a news article for further processing.
  • Integrating with APIs that require browser-based authentication flows.
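
The fetch-and-render flow described in this overview can be sketched roughly as follows. This is a minimal illustration, not the node's actual source: the function name `getPageContent`, the `FetchOptions` shape, and the structural `PageLike` interface are assumptions made for the example. The methods on `PageLike` (`setExtraHTTPHeaders`, `setCacheEnabled`, `goto`, `content`) mirror methods that exist on Puppeteer's `Page`, so a real page object should fit this shape.

```typescript
// Lifecycle events accepted by Puppeteer's page.goto() `waitUntil` option.
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

// Structural subset of Puppeteer's HTTPResponse used in this sketch.
interface ResponseLike {
  status(): number;
  headers(): Record<string, string>;
}

// Structural subset of Puppeteer's Page used in this sketch.
interface PageLike {
  setExtraHTTPHeaders(headers: Record<string, string>): Promise<void>;
  setCacheEnabled(enabled?: boolean): Promise<void>;
  goto(
    url: string,
    options?: { timeout?: number; waitUntil?: WaitUntil },
  ): Promise<ResponseLike | null>;
  content(): Promise<string>;
}

// Hypothetical option bag; field names are illustrative, not the node's own.
interface FetchOptions {
  extraHeaders?: Record<string, string>;
  pageCaching?: boolean; // assumed to default to enabled
  timeout?: number;      // milliseconds; 0 disables the timeout
  waitUntil?: WaitUntil;
}

interface PageContentResult {
  body: string;
  headers: Record<string, string>;
  statusCode: number;
}

async function getPageContent(
  page: PageLike,
  url: string,
  opts: FetchOptions = {},
): Promise<PageContentResult> {
  if (opts.extraHeaders) {
    await page.setExtraHTTPHeaders(opts.extraHeaders);
  }
  await page.setCacheEnabled(opts.pageCaching ?? true);

  const response = await page.goto(url, {
    timeout: opts.timeout,
    waitUntil: opts.waitUntil ?? "load",
  });
  if (response === null) {
    throw new Error(`No response received for ${url}`);
  }

  // page.content() returns the serialized DOM after scripts have run.
  return {
    body: await page.content(),
    headers: response.headers(),
    statusCode: response.status(),
  };
}
```

With a real Puppeteer installation, `page` would typically come from `browser.newPage()` on a browser returned by `puppeteer.launch()`. Taking the page as a parameter keeps the sketch testable with a stub.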

Properties

Name | Type | Meaning
URL | string | The target web page address to fetch.
Query Parameters | fixedCollection | List of query parameters (name/value pairs) to append to the URL.
Options | collection | Advanced settings for browser behavior (see below).
└ Emulate Device | options | Emulates a specific device (e.g., iPhone, Pixel) for the browser session.
└ Executable path | string | Path to a custom Chromium/Chrome executable for Puppeteer to use.
└ Extra Headers | fixedCollection | Additional HTTP headers (name/value pairs) to send with the request.
└ File Name | string | If binary output is generated, sets the file name (not used in Get Page Content).
└ Launch Arguments | fixedCollection | Additional command-line arguments for launching the browser instance.
└ Timeout | number | Maximum navigation time in milliseconds (default: 30ms; 0 disables timeout).
└ Wait Until | options | Event to consider navigation complete: load, DOMContentLoaded, networkidle0, or networkidle2.
└ Page Caching | boolean | Enables/disables browser cache during navigation (default: enabled).
└ Headless mode | boolean | Runs browser in headless mode (no UI, default: true).
└ Stealth mode | boolean | Applies anti-detection techniques to make automation harder to detect (default: false).
└ Proxy Server | string | Proxy server configuration (e.g., localhost:8080, socks5://localhost:1080).
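
To make the properties above more concrete, here is one plausible way they could translate into a final URL and browser launch settings. This is a sketch under assumptions: `buildTargetUrl`, `toLaunchOptions`, and the `NodeOptions` field names are hypothetical and may not match how the node does it internally. The `headless`, `executablePath`, and `args` keys correspond to Puppeteer launch options, and `--proxy-server` is a Chromium command-line switch.

```typescript
interface QueryParameter {
  name: string;
  value: string;
}

// Append name/value pairs to the URL, keeping any query string it already has.
function buildTargetUrl(baseUrl: string, params: QueryParameter[]): string {
  const url = new URL(baseUrl); // throws a TypeError if baseUrl is not a valid absolute URL
  for (const { name, value } of params) {
    url.searchParams.append(name, value);
  }
  return url.toString();
}

// Hypothetical subset of the node's Options collection.
interface NodeOptions {
  executablePath?: string;
  launchArguments?: string[];
  headless?: boolean;
  proxyServer?: string;
}

// Subset of Puppeteer launch options used in this sketch.
interface LaunchOptionsSketch {
  headless: boolean;
  executablePath?: string;
  args: string[];
}

function toLaunchOptions(options: NodeOptions): LaunchOptionsSketch {
  const args = [...(options.launchArguments ?? [])];
  if (options.proxyServer) {
    // Chromium takes the proxy as a command-line switch.
    args.push(`--proxy-server=${options.proxyServer}`);
  }
  const launch: LaunchOptionsSketch = {
    headless: options.headless ?? true, // documented default: true
    args,
  };
  if (options.executablePath) {
    launch.executablePath = options.executablePath;
  }
  return launch;
}
```

Note that Puppeteer spells the Wait Until value `domcontentloaded` in lowercase when it is passed to `page.goto()`.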

Output

The node outputs a single item per input, with the following json structure:

{
  "body": "<string>",          // The full HTML content of the fetched page
  "headers": { ... },          // Object containing response headers
  "statusCode": <number>       // HTTP status code returned by the server
}

No binary data is produced by this operation.
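
When consuming this output in later processing, it can help to check the shape before using it. The following is a small sketch of a type and runtime guard for the structure shown above; the names `PageContentItem` and `isPageContentItem` are made up for this example and are not part of the node.

```typescript
// Shape of the JSON produced by Get Page Content, as described above.
interface PageContentItem {
  body: string;
  headers: Record<string, string>;
  statusCode: number;
}

// Runtime check that an unknown value has the three documented fields.
function isPageContentItem(value: unknown): value is PageContentItem {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["body"] === "string" &&
    typeof candidate["headers"] === "object" &&
    candidate["headers"] !== null &&
    typeof candidate["statusCode"] === "number"
  );
}

// Example: only read the body when the item has the expected shape.
function bodyLength(item: unknown): number {
  return isPageContentItem(item) ? item.body.length : 0;
}
```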

Dependencies

  • External Services: None required by default, but the node will access external web pages as specified by the URL property.
  • API Keys: Not required unless accessing protected resources.
  • Node.js Packages:
    • puppeteer-extra
    • puppeteer-extra-plugin-stealth
    • puppeteer
  • n8n Configuration:
    • Ensure the environment running n8n is able to install and run Puppeteer and its dependencies.
    • For custom Chrome/Chromium executables, ensure the path is accessible and compatible.

Troubleshooting

Common issues:

  • Timeouts: If the page takes too long to load, increase the "Timeout" option or set it to 0 to disable.
  • Blocked by anti-bot: Enable "Stealth mode" to reduce detection risk.
  • Invalid URL: Ensure the "URL" property is a valid, reachable address.
  • Proxy errors: Double-check proxy format and credentials if using "Proxy Server".
  • Missing browser executable: If specifying a custom "Executable path", verify the path is correct and points to a supported browser.
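
For the proxy format check mentioned above, a quick validation before running the workflow can catch typos early. This is a minimal sketch that assumes the two formats shown in the Properties section (`host:port` and `scheme://host:port`). `parseProxyServer` is a hypothetical helper, and treating a missing scheme as `http` is an assumption of this example. Values with inline credentials (`user:pass@host`) are deliberately rejected here.

```typescript
interface ProxyParts {
  scheme: string;
  host: string;
  port: number;
}

// Accepts "host:port" or "scheme://host:port"; returns null for anything else.
function parseProxyServer(value: string): ProxyParts | null {
  const match = /^(?:([a-z][a-z0-9+.-]*):\/\/)?([^\s:\/@]+):(\d{1,5})$/i.exec(value.trim());
  if (match === null) {
    return null;
  }
  const port = Number(match[3]);
  if (port < 1 || port > 65535) {
    return null;
  }
  return {
    scheme: (match[1] ?? "http").toLowerCase(), // assumption: no scheme means http
    host: match[2],
    port,
  };
}
```

A `null` result suggests the value should be corrected before it is placed in the "Proxy Server" option.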

Error messages:

  • Request failed with status code <code>: The target server responded with a non-200 status. Check the URL, headers, and any required authentication.
  • Navigation timeout: The page did not finish loading within the specified timeout. Increase the timeout or check network conditions.
  • Cannot find module 'puppeteer': Ensure all required npm packages are installed in your n8n environment.
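
If these errors are handled programmatically, for example in an error workflow, the messages above can be mapped to the suggested fixes. The sketch below is illustrative only: `troubleshootingHint` is a hypothetical helper, and the matching relies on the message fragments listed in this section.

```typescript
// Map a known error message to the corresponding advice from this section.
// Returns null when the message is not one of the documented cases.
function troubleshootingHint(message: string): string | null {
  if (/Request failed with status code \d+/.test(message)) {
    return "Check the URL, headers, and any required authentication.";
  }
  if (/Navigation timeout/i.test(message)) {
    return "Increase the Timeout option or check network conditions.";
  }
  if (message.includes("Cannot find module 'puppeteer'")) {
    return "Install the required npm packages in your n8n environment.";
  }
  return null;
}
```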
