Puppeteer Cartier

Automate browser interactions using Puppeteer

Overview

This node allows you to run custom JavaScript code using Puppeteer, a powerful browser automation library. It provides direct access to a headless (or optionally headed) Chromium browser instance through Puppeteer's API, enabling advanced web scraping, automated browsing, and interaction with web pages.

Typical use cases include:

  • Extracting data from websites that require JavaScript rendering.
  • Automating form submissions or navigation flows.
  • Taking screenshots or generating PDFs of web pages.
  • Running complex custom scripts that interact with page content dynamically.

For example, you can write a script to navigate to an IP lookup service, extract the IP address shown on the page, and return it as output. This flexibility makes it ideal for scenarios where standard HTTP requests are insufficient due to client-side rendering or interactive elements.
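
The IP lookup example above could look roughly like the sketch below. It is written as a standalone function so it can be read outside the node: `page` stands for the Puppeteer page the node exposes as $page, while `lookupIp`, the `url` argument, and `inputJson` (a stand-in for the incoming item's data) are hypothetical names chosen for illustration.

```javascript
// Minimal sketch, not the node's bundled sample script.
// Matches the first IPv4-looking token in a block of text.
const IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/;

async function lookupIp(page, url, inputJson = {}) {
  // Wait until the network is nearly idle so client-rendered text is present.
  await page.goto(url, { waitUntil: 'networkidle2' });

  // The callback runs inside the browser and returns the page's visible text.
  const text = await page.evaluate(() => document.body.innerText);

  const match = text.match(IPV4);
  if (!match) {
    throw new Error(`No IPv4 address found at ${url}`);
  }

  // One output item: the input properties plus the extracted address.
  return [{ ...inputJson, ip: match[0] }];
}
```

Inside the Script Code field, the body would reduce to something like `return await lookupIp($page, 'https://example.com/ip')`, where the URL is a placeholder for whichever lookup service you use.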

Properties

Name: Meaning
Script Code: JavaScript code to execute within the Puppeteer environment. You have access to $browser, $page, and $puppeteer objects representing the browser and page instances. The script must return an array of items.
Options: Collection of various Puppeteer launch and page options:
- Batch Size: Maximum number of pages to open simultaneously. Higher values increase resource usage.
- Browser WebSocket Endpoint: WebSocket URL to connect to an existing browser instance instead of launching a new one.
- Browser WebSocket Endpoint Authorization: Authorization header value used when connecting to the browser WebSocket endpoint.
- Emulate Device: Select a device profile to emulate (e.g., iPhone, iPad).
- Executable Path: Path to the Chromium/Chrome executable to use. Ignored if connecting via WebSocket.
- Extra Headers: Custom HTTP headers to send with each request.
- File Name: Filename to assign to binary outputs like screenshots or PDFs (not applicable to custom script operation).
- Launch Arguments: Additional command line arguments passed to the browser process.
- Timeout: Maximum navigation time in milliseconds (not applicable to custom script operation).
- Protocol Timeout: Maximum time to wait for protocol responses in milliseconds.
- Wait Until: Event to consider navigation successful (load, domcontentloaded, networkidle0, networkidle2). Not applicable to custom script operation.
- Page Caching: Enable or disable page-level caching. Defaults to enabled.
- Headless mode: Run browser in headless mode (no UI). Defaults to true.
- Use Chrome Headless Shell: Run browser in headless shell mode (requires chrome-headless-shell in PATH). Requires headless mode enabled. Defaults to false.
- Stealth mode: Apply techniques to make Puppeteer harder to detect as a bot. Defaults to false.
- Human typing mode: Enables .typeHuman() function on pages to simulate human-like typing with configurable delays and typo chances. Defaults to false.
- Human Typing Options: Configuration for human typing behavior such as delay ranges and typo probabilities. Only visible if human typing mode is enabled.
- Proxy Server: Proxy server configuration string (e.g., localhost:8080, socks5://localhost:1080).
- Add Container Arguments: Automatically add recommended arguments for running inside container environments (--no-sandbox, etc.). Defaults to true.
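
To make the interaction between these options more concrete, the sketch below shows one plausible way they could be translated into the arguments of Puppeteer's `puppeteer.connect()` and `puppeteer.launch()`. It is not the node's actual source: the camelCase option names and the `buildBrowserOptions` helper are hypothetical, and the container flags beyond --no-sandbox are an assumption, since the exact list is not stated here. The `headless: 'shell'` value is only understood by recent Puppeteer versions.

```javascript
// Assumed container flags; only --no-sandbox is named in the option description.
const CONTAINER_ARGS = [
  '--no-sandbox',
  '--disable-setuid-sandbox',
  '--disable-dev-shm-usage',
];

// Hypothetical helper: turns node-style options into Puppeteer options.
function buildBrowserOptions(opts = {}) {
  // A WebSocket endpoint means "connect"; launch-only settings such as the
  // executable path are ignored in that case.
  if (opts.browserWSEndpoint) {
    const options = { browserWSEndpoint: opts.browserWSEndpoint };
    if (opts.browserWSEndpointAuthorization) {
      options.headers = { Authorization: opts.browserWSEndpointAuthorization };
    }
    if (opts.protocolTimeout !== undefined) {
      options.protocolTimeout = opts.protocolTimeout;
    }
    return { mode: 'connect', options }; // pass to puppeteer.connect(options)
  }

  const args = [...(opts.launchArguments || [])];
  if (opts.addContainerArgs !== false) args.push(...CONTAINER_ARGS); // default on
  if (opts.proxyServer) args.push(`--proxy-server=${opts.proxyServer}`);

  const headless = opts.headless !== false; // defaults to true
  const options = {
    // Headless shell only applies when headless mode itself is enabled.
    headless: headless && opts.headlessShell ? 'shell' : headless,
    args,
  };
  if (opts.executablePath) options.executablePath = opts.executablePath;
  if (opts.protocolTimeout !== undefined) {
    options.protocolTimeout = opts.protocolTimeout;
  }
  return { mode: 'launch', options }; // pass to puppeteer.launch(options)
}
```

Keeping the translation in a pure function like this makes the precedence rules (endpoint over executable path, headless shell only with headless mode) easy to check without starting a browser.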

Output

The node outputs an array of items, each containing a json object with arbitrary keys and values as returned by your custom script. The script must return an array of objects, where each object represents one output item.

Example output structure from the default sample script:

[
  {
    "ip": "123.45.67.89",
    // ... any other properties merged from input JSON
  }
]
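
The return contract described above can be illustrated with a small validator. This is a sketch under the assumption that each returned object is wrapped into an item's json property as described; the `toOutputItems` name and the error wording are hypothetical and do not reproduce the node's own messages.

```javascript
// Accept only an array of plain objects and wrap each one as an output item.
function toOutputItems(result) {
  if (!Array.isArray(result)) {
    throw new TypeError('Script result must be an array of objects');
  }
  return result.map((entry, index) => {
    const isPlainObject =
      entry !== null && typeof entry === 'object' && !Array.isArray(entry);
    if (!isPlainObject) {
      throw new TypeError(`Entry ${index} of the script result is not an object`);
    }
    return { json: entry };
  });
}
```

A script that ends with `return { ip }` instead of `return [{ ip }]` would fail the first check, which is the most common way to trip over this requirement.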

Operations that produce binary data (e.g., screenshots or PDFs) return it in the binary property of the output items. This does not apply to the "Run Custom Script" operation.

Dependencies

  • Puppeteer: The node uses Puppeteer with optional plugins for stealth mode and human typing simulation.
  • Chromium/Chrome Browser: Either bundled with Puppeteer or specified via executable path or connected via WebSocket endpoint.
  • Optional Plugins:
    • Stealth plugin to evade detection.
    • Human typing plugin to simulate realistic typing.
  • Node.js vm2 sandbox: Runs user-provided scripts in an isolated context with controlled access to Puppeteer objects.
  • n8n environment variables may influence behavior, e.g., enabling console output.

No internal credential names are exposed; however, if connecting to a remote browser via WebSocket, an authorization token may be required.

Troubleshooting

  • Common issues:

    • Browser launch failures: Ensure the executable path is correct or the WebSocket endpoint is reachable.
    • Timeouts: Navigation or protocol timeouts can occur if pages take too long to load or respond.
    • Script errors: Your custom script must return an array of items; otherwise, an error is thrown.
    • Resource limits: Opening many pages simultaneously (batch size) can exhaust memory or CPU.
    • Missing dependencies: If using headless shell mode, ensure chrome-headless-shell is installed and in your system PATH.
  • Error messages and resolutions:

    • "Custom script must return an array of items...": Make sure your script returns an array, e.g., return [{ key: value }];.
    • "Request failed with status code XXX": The page navigation failed; check the URL and network connectivity.
    • "Failed to launch/connect to browser: ...": Verify browser executable path, WebSocket URL, and authorization headers.
    • "Invalid URL: ...": The provided URL parameter is malformed; validate URLs before use.
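
Two of the issues above, resource exhaustion from too many open pages and malformed URLs, can be guarded against inside a custom script. The sketch below validates every URL up front and then opens at most `batchSize` pages at a time. It only illustrates the idea and is not how the node implements Batch Size; `scrapeInBatches`, `chunk`, `assertValidUrl`, and the `scrape` callback are hypothetical names, and `browser` stands for the instance exposed as $browser.

```javascript
// Split a list into consecutive chunks of at most `size` elements.
function chunk(list, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('Batch size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < list.length; i += size) {
    chunks.push(list.slice(i, i + size));
  }
  return chunks;
}

// Fail early on malformed or non-HTTP(S) URLs instead of during navigation.
function assertValidUrl(value) {
  const url = new URL(value); // throws a TypeError on malformed input
  if (!['http:', 'https:'].includes(url.protocol)) {
    throw new TypeError(`Unsupported protocol in URL: ${value}`);
  }
  return url.href;
}

async function scrapeInBatches(browser, urls, batchSize, scrape) {
  const validated = urls.map((u) => assertValidUrl(u));
  const results = [];
  for (const batch of chunk(validated, batchSize)) {
    // Pages within one batch run concurrently; batches run one after another.
    const batchResults = await Promise.all(
      batch.map(async (url) => {
        const page = await browser.newPage();
        try {
          return await scrape(page, url);
        } finally {
          await page.close(); // always release the page, even on errors
        }
      })
    );
    results.push(...batchResults);
  }
  return results;
}
```

Because one failing page rejects the whole `Promise.all`, a script that should tolerate individual failures would need to catch errors inside the `scrape` callback and return an error item instead.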

Use console.log() statements inside your script to debug and view output in the n8n execution logs or UI.
