Overview
This n8n node provides advanced browser automation capabilities using Puppeteer. It allows you to control a headless (or full) browser to perform tasks such as loading web pages, taking screenshots, generating PDFs, scraping content, and running custom scripts in the browser context. The node is highly configurable, supporting device emulation, proxy settings, stealth mode, human-like typing simulation, and more.
Common scenarios:
- Automated website testing or monitoring.
- Web scraping and data extraction from dynamic sites.
- Generating screenshots or PDFs of web pages for reporting or archiving.
- Running custom JavaScript in the context of a loaded page.
- Bypassing anti-bot measures with stealth and human typing plugins.
Practical examples:
- Capture daily screenshots of a dashboard for archival.
- Scrape product prices from an e-commerce site that requires JavaScript rendering.
- Generate PDFs of invoices from a web application.
- Automate login and navigation flows for testing purposes.
Properties
The node supports the following input properties:
| Display Name | Type | Description |
|---|---|---|
| Options | collection | A group of advanced configuration options for browser behavior and performance. |
| ├─ Batch Size | number | Maximum number of pages to open simultaneously. Higher values use more memory/CPU. |
| ├─ Browser WebSocket Endpoint | string | WebSocket URL to connect to an existing browser instance instead of launching a new one. |
| ├─ Emulate Device | options | Emulate a specific device (e.g., mobile, tablet). |
| ├─ Executable path | string | Path to the browser executable. Ignored if connecting via WebSocket. |
| ├─ Extra Headers | fixedCollection | Custom HTTP headers to send with requests. |
| ├─ File Name | string | File name for binary output (PDF/Screenshot). |
| ├─ Launch Arguments | fixedCollection | Additional command-line arguments for the browser. |
| ├─ Timeout | number | Max navigation time in ms (0 disables timeout). |
| ├─ Protocol Timeout | number | Max protocol response wait time in ms (0 disables timeout). |
| ├─ Wait Until | options | When to consider navigation successful (load, domcontentloaded, networkidle0, networkidle2). |
| ├─ Page Caching | boolean | Enable/disable page-level caching. |
| ├─ Headless mode | boolean | Run browser in headless mode (no UI). |
| ├─ Use Chrome Headless Shell | boolean | Use chrome-headless-shell (requires it in $PATH). |
| ├─ Stealth mode | boolean | Makes detection of headless Puppeteer harder. |
| ├─ Human typing mode | boolean | Simulates human-like typing in input fields. |
| ├─ Human Typing Options | collection | Fine-tune delays and typo probabilities for human typing simulation. |
| ├─ Proxy Server | string | Use a custom proxy (e.g., localhost:8080, socks5://localhost:1080). |
| └─ Add Container Arguments | boolean | Adds recommended args for container environments (e.g., --no-sandbox). |
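
To make the interaction between these options concrete, the sketch below shows one way they could be turned into browser settings. It is an illustration under assumptions, not the node's actual implementation: the `NodeOptions` field names, the `BrowserSetup` type, and `buildBrowserSetup` are hypothetical, and the default of `headless: true` is assumed. The `--proxy-server` and `--no-sandbox` switches are standard Chromium flags.

```typescript
// Hypothetical shape for a subset of the options in the table above.
// Field names are illustrative and not taken from the node's source.
interface NodeOptions {
  headless?: boolean;
  executablePath?: string;
  browserWSEndpoint?: string;
  proxyServer?: string;
  addContainerArgs?: boolean;
  launchArguments?: string[];
  protocolTimeout?: number;
}

// Either connect to an existing browser or launch a new one.
type BrowserSetup =
  | { mode: "connect"; browserWSEndpoint: string; protocolTimeout?: number }
  | {
      mode: "launch";
      headless: boolean;
      executablePath?: string;
      args: string[];
      protocolTimeout?: number;
    };

function buildBrowserSetup(options: NodeOptions): BrowserSetup {
  // A WebSocket endpoint takes precedence; the executable path is ignored.
  if (options.browserWSEndpoint) {
    return {
      mode: "connect",
      browserWSEndpoint: options.browserWSEndpoint,
      protocolTimeout: options.protocolTimeout,
    };
  }

  const args: string[] = [...(options.launchArguments ?? [])];
  if (options.addContainerArgs) {
    // Assumption: the real node may add more container flags than this one.
    args.push("--no-sandbox");
  }
  if (options.proxyServer) {
    args.push(`--proxy-server=${options.proxyServer}`);
  }

  return {
    mode: "launch",
    headless: options.headless ?? true, // assumed default
    executablePath: options.executablePath,
    args,
    protocolTimeout: options.protocolTimeout,
  };
}
```

With Puppeteer, a result in `connect` mode would typically be passed to `puppeteer.connect`, and one in `launch` mode to `puppeteer.launch`.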
Output
The structure of the output depends on the operation performed. In general, each output item contains:
json:
- For page content:
  { "body": "<html>...</html>", "headers": { ... }, "statusCode": 200, "url": "https://example.com" }
- For errors:
  { "error": "Error message", "url": "https://example.com" } (the url field is optional)
- For other operations, relevant metadata (headers, status code, url).
binary (for PDF or Screenshot operations):
- Contains the file data under the property name specified by the user (e.g., "data").
- The binary field includes the file with the correct MIME type (image/png, image/jpeg, or application/pdf) and the file name, if set.
pairedItem:
- Links the output to the corresponding input item.
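A downstream consumer usually needs to tell a page-content item from an error item. The following is a minimal sketch of that check, assuming the two json shapes described above. The type names and the `summarize` helper are hypothetical and not part of the node.

```typescript
// Shapes assumed from the json examples in this section.
interface PageContentJson {
  body: string;
  headers: Record<string, string>;
  statusCode: number;
  url: string;
}

interface ErrorJson {
  error: string;
  url?: string; // optional, as noted above
}

type OutputJson = PageContentJson | ErrorJson;

// Type guard: error items are identified by the presence of an "error" field.
function isErrorJson(json: OutputJson): json is ErrorJson {
  return "error" in json;
}

// Hypothetical helper that turns an item into a one-line summary.
function summarize(json: OutputJson): string {
  if (isErrorJson(json)) {
    const where = json.url ? ` (${json.url})` : "";
    return `failed${where}: ${json.error}`;
  }
  return `${json.statusCode} ${json.url} (${json.body.length} chars)`;
}
```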
Dependencies
External Services / Libraries:
- puppeteer-extra
- puppeteer-extra-plugin-stealth
- puppeteer-extra-plugin-human-typing
- puppeteer
- @n8n/vm2 (for running custom scripts)
Environment Variables (optional):
- NODE_FUNCTION_ALLOW_BUILTIN
- NODE_FUNCTION_ALLOW_EXTERNAL
- CODE_ENABLE_STDOUT
n8n Configuration:
- If using "Use Chrome Headless Shell", ensure chrome-headless-shell is available in $PATH.
- For proxy usage, ensure the proxy server is accessible from the n8n environment.
Troubleshooting
Common issues:
Browser fails to launch:
- Check that the browser executable exists at the specified path, or that Docker/container permissions allow execution.
- If using "Use Chrome Headless Shell", ensure it's installed and in $PATH.
Timeouts:
- Increase the "Timeout" or "Protocol Timeout" values if pages take longer to load.
- Setting these to 0 disables the respective timeouts.
Invalid URL error:
- Ensure the "URL" parameter is a valid, fully qualified URL.
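One way to catch this before the node runs is to validate URLs upstream. The sketch below uses the standard WHATWG `URL` constructor. The helper name is hypothetical, and restricting the scheme to http and https is an assumption about what the node accepts.

```typescript
// Returns true only for absolute http(s) URLs.
// Assumption: other schemes (file:, data:, ...) are treated as invalid here.
function isFullyQualifiedUrl(value: string): boolean {
  try {
    const parsed = new URL(value); // throws on relative or malformed input
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}
```

For example, "https://example.com" passes, while "example.com" and "/pricing" are rejected because they have no scheme.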
Proxy errors:
- Verify the proxy server address and credentials.
- Ensure the proxy is reachable from the n8n host.
Custom script errors:
- Scripts must return an array of items. If not, you'll see:
  "Custom script must return an array of items. Please ensure your script returns an array, e.g., return [{ key: value }]."
- Syntax errors or runtime exceptions in the script will be reported in the output's error field.
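The two rules above can be sketched as follows. This is a simplified, synchronous illustration and not the node's actual code: `assertScriptResult` and `collectScriptOutput` are hypothetical names, and a real custom script would normally run asynchronously inside a sandbox.

```typescript
type Item = Record<string, unknown>;

// Rejects any return value that is not an array, using the message quoted above.
function assertScriptResult(result: unknown): Item[] {
  if (!Array.isArray(result)) {
    throw new Error(
      "Custom script must return an array of items. Please ensure your " +
        "script returns an array, e.g., return [{ key: value }]."
    );
  }
  return result as Item[];
}

// Runs a script function and reports any failure in an "error" field.
function collectScriptOutput(run: () => unknown): Item[] {
  try {
    return assertScriptResult(run());
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return [{ error: message }];
  }
}
```

A script body ending in `return [{ title: "Example" }]` passes the check, whereas `return { title: "Example" }` triggers the error because the object is not wrapped in an array.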
Resource limitations:
- High "Batch Size" or many simultaneous pages may exhaust system resources (memory/CPU).
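To illustrate why "Batch Size" bounds resource usage, here is a minimal sketch of batched processing: items are split into groups of at most `batchSize`, and each group finishes before the next one starts. The function names are hypothetical, and the node's own scheduling may differ.

```typescript
// Splits items into consecutive groups of at most batchSize elements.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Runs worker on every item, with at most batchSize calls in flight at once.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    // All pages in one batch are open simultaneously, so peak memory
    // grows with batchSize rather than with the total number of items.
    const batchResults = await Promise.all(batch.map(worker));
    results.push(...batchResults);
  }
  return results;
}
```

In this scheme, lowering the batch size trades throughput for a smaller peak footprint, which is usually the first thing to try when the host runs out of memory.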