Browser

A node to run a headless browser and take screenshots or PDFs

Overview

With Resource set to Page and Operation set to HTML, the Browser node loads a web page in a headless browser (using Puppeteer), optionally customizes the browsing environment, and extracts the full HTML content of the page. It is particularly useful when you need to:

  • Scrape or archive the rendered HTML of dynamic web pages (including those requiring JavaScript execution).
  • Automate website testing or monitoring by capturing the final DOM after all scripts/styles have loaded.
  • Pre-render single-page applications or other JS-heavy sites for SEO or further processing.

Practical examples:

  • Extracting the HTML of a product page after all client-side rendering has completed.
  • Capturing the state of a dashboard or report page for archiving or analysis.
  • Fetching the HTML output of a page that requires custom headers, cookies, or injected scripts/styles.
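
The flow described in the overview can be sketched roughly as follows. This is an illustration, not the node's actual source: the `PageLike` interface lists only the Puppeteer `Page` methods the sketch uses, and the option names (`timeout`, `userAgent`, `element`, `disableJavascript`) are hypothetical stand-ins for the node's Options. The 30000 ms fallback is Puppeteer's documented default navigation timeout; whether the node uses the same default is an assumption.

```typescript
// Structural subset of Puppeteer's Page; a real Page object should satisfy it.
interface PageLike {
  setUserAgent(userAgent: string): Promise<unknown>;
  setJavaScriptEnabled(enabled: boolean): Promise<unknown>;
  goto(url: string, options?: { timeout?: number }): Promise<unknown>;
  waitForSelector(
    selector: string,
    options?: { visible?: boolean; timeout?: number }
  ): Promise<unknown>;
  title(): Promise<string>;
  content(): Promise<string>;
}

// Hypothetical option names mirroring a few of the node's Options.
interface HtmlOptions {
  timeout?: number;
  userAgent?: string;
  element?: string;
  disableJavascript?: boolean;
}

interface HtmlResult {
  url: string;
  title: string;
  content: string;
}

// Puppeteer's default navigation timeout (assumed, not confirmed, for the node).
const DEFAULT_TIMEOUT_MS = 30000;

async function extractHtml(
  page: PageLike,
  url: string,
  options: HtmlOptions = {}
): Promise<HtmlResult> {
  const timeout = options.timeout ?? DEFAULT_TIMEOUT_MS;

  // Environment tweaks must be applied before navigation to take effect.
  if (options.userAgent) await page.setUserAgent(options.userAgent);
  if (options.disableJavascript) await page.setJavaScriptEnabled(false);

  await page.goto(url, { timeout });

  // Optionally wait until a specific element is visible.
  if (options.element) {
    await page.waitForSelector(options.element, { visible: true, timeout });
  }

  // content() returns the serialized DOM as it stands after scripts have run.
  return { url, title: await page.title(), content: await page.content() };
}
```

Typing the page as a small interface rather than importing Puppeteer keeps the sketch self-contained; in a real script you would pass the object returned by `browser.newPage()`.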

Properties

| Name | Type | Meaning |
|------|------|---------|
| Target URL | string | The URL of the web page to load and extract HTML from. |
| Options | collection | Additional settings to control browser behavior (sub-options below). |
| └ Timeout | number | Maximum time, in milliseconds, to wait for the page to load. |
| └ User-Agent | string | Custom user agent string for the browser session. |
| └ Element | string | CSS selector to wait for before proceeding (ensures the element is visible before extraction). |
| └ Clip to Element? | boolean | Whether to clip screenshots to a specific element (not relevant for the HTML operation). |
| └ Full Page? | boolean | Whether to capture the entire page or just the viewport (not relevant for the HTML operation). |
| └ Disable Javascript? | boolean | If true, disables JavaScript execution on the page. |
| └ Scroll To | string | CSS selector of an element to scroll into view before extraction. |
| └ Javascript Code | string | Custom JavaScript code to execute in the page context; the result is included in the output. |
| └ Headers | collection | List of HTTP headers to send with the request. |
| └ Styles | collection | List of style tags (by URL or content) to inject into the page before rendering. |
| └ Scripts | collection | List of script tags (by URL or content) to inject before rendering. |
| Viewport | collection | Settings for browser viewport size and device emulation. |
| └ Width | number | Width of the browser viewport in pixels. |
| └ Height | number | Height of the browser viewport in pixels. |
| └ Scale Factor | number | Device scale factor (for HiDPI/Retina emulation). |
| └ Is Mobile | boolean | Emulate a mobile device if true. |
| └ Is Touchscreen | boolean | Emulate a touchscreen if true. |
| └ Is Landscape | boolean | Use landscape orientation if true. |
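
The Viewport settings correspond closely to the fields of Puppeteer's viewport object (`width`, `height`, `deviceScaleFactor`, `isMobile`, `hasTouch`, `isLandscape`). The sketch below shows one plausible mapping. The `NodeViewport` field names are hypothetical, and the fallback values are Puppeteer's own defaults (800x600, scale factor 1), which the node may or may not share.

```typescript
// Hypothetical shape of the node's Viewport collection (field names assumed).
interface NodeViewport {
  width?: number;
  height?: number;
  scaleFactor?: number;
  isMobile?: boolean;
  isTouchscreen?: boolean;
  isLandscape?: boolean;
}

// Mirrors the fields accepted by Puppeteer's page.setViewport().
interface PuppeteerViewport {
  width: number;
  height: number;
  deviceScaleFactor: number;
  isMobile: boolean;
  hasTouch: boolean;
  isLandscape: boolean;
}

function toPuppeteerViewport(v: NodeViewport = {}): PuppeteerViewport {
  return {
    width: v.width ?? 800, // Puppeteer default width
    height: v.height ?? 600, // Puppeteer default height
    deviceScaleFactor: v.scaleFactor ?? 1, // "Scale Factor"
    isMobile: v.isMobile ?? false, // "Is Mobile"
    hasTouch: v.isTouchscreen ?? false, // "Is Touchscreen"
    isLandscape: v.isLandscape ?? false, // "Is Landscape"
  };
}

// Example: a phone-like viewport with a 2x scale factor.
const phoneViewport = toPuppeteerViewport({
  width: 375,
  height: 667,
  scaleFactor: 2,
  isMobile: true,
  isTouchscreen: true,
});
```

The resulting object could be passed to `page.setViewport(...)` in a standalone Puppeteer script.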

Output

The node outputs a JSON object with the following structure:

{
  "url": "<the target URL>",
  "title": "<page title>",
  "metrics": { /* browser performance metrics */ },
  "evaluateResponse": "<result of custom JS code, if provided>",
  "content": "<full HTML content of the page>"
}
  • url: The URL that was loaded.
  • title: The document's title.
  • metrics: An object containing various browser/page performance metrics as reported by Puppeteer.
  • evaluateResponse: (Optional) The result of any custom JavaScript code executed on the page.
  • content: The complete HTML markup of the loaded page, including any modifications made by scripts/styles.
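
When consuming this output in downstream code, it can help to describe the structure as a type and check it at runtime. The following is a minimal sketch; the names `BrowserHtmlOutput` and `isBrowserHtmlOutput` are illustrative and not part of the node.

```typescript
// Shape of the HTML operation's output as documented above.
interface BrowserHtmlOutput {
  url: string;
  title: string;
  metrics: Record<string, unknown>;
  evaluateResponse?: unknown; // present only when custom JavaScript was supplied
  content: string;
}

// Runtime check that an unknown value has the documented required fields.
function isBrowserHtmlOutput(value: unknown): value is BrowserHtmlOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["url"] === "string" &&
    typeof v["title"] === "string" &&
    typeof v["content"] === "string" &&
    typeof v["metrics"] === "object" &&
    v["metrics"] !== null
  );
}

// Example: validate a parsed item before reading its HTML.
const parsedItem: unknown = JSON.parse(
  '{"url":"https://example.com","title":"Example","metrics":{},"content":"<html></html>"}'
);
const htmlLength = isBrowserHtmlOutput(parsedItem) ? parsedItem.content.length : 0;
```

The guard treats `evaluateResponse` as optional because it is only produced when custom code is provided.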

Note: No binary data is produced for the HTML operation.


Dependencies

  • External Service: Requires access to the internet to fetch the target URL.
  • Library: Uses puppeteer for headless browser automation.
  • n8n Configuration: No special credentials are required for this operation. The node runs Chrome/Chromium in headless mode.

Troubleshooting

Common Issues:

  • Timeouts:

    • Error: Navigation timeout of X ms exceeded
      Cause: The page took too long to load.
      Solution: Increase the "Timeout" option or check network connectivity.
  • Element Not Found:

    • Error: waiting for selector ... failed: timeout
      Cause: The specified CSS selector in "Element" or "Scroll To" did not match any element.
      Solution: Double-check the selector or increase the timeout.
  • Blocked by Website:

    • Some websites may block headless browsers or require specific headers/cookies.
      Solution: Set a realistic "User-Agent" and add necessary headers.
  • JavaScript Disabled:

    • If "Disable Javascript?" is enabled, dynamic content may not render.
      Solution: Only disable JavaScript if you are sure the page does not require it.
  • Invalid Custom Code:

    • Error: ReferenceError or similar when running custom JavaScript.
      Solution: Ensure your code is valid and does not reference unavailable variables.
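
If a workflow handles failures programmatically, the error messages listed above can be mapped to the suggested fixes. The sketch below is a hypothetical helper, not part of the node; the patterns are based only on the messages quoted in this section, and the exact wording may differ between Puppeteer versions.

```typescript
type BrowserIssue =
  | "timeout"
  | "element-not-found"
  | "invalid-custom-code"
  | "unknown";

// Match an error message against the patterns documented above.
function classifyBrowserError(message: string): BrowserIssue {
  if (/navigation timeout of \d+ ms exceeded/i.test(message)) return "timeout";
  if (/waiting for selector/i.test(message)) return "element-not-found";
  if (/ReferenceError|SyntaxError|TypeError/.test(message)) {
    return "invalid-custom-code";
  }
  return "unknown";
}

// Suggested fixes, paraphrased from the troubleshooting list.
const browserIssueHints: Record<BrowserIssue, string> = {
  "timeout": 'Increase the "Timeout" option or check network connectivity.',
  "element-not-found": 'Check the "Element" / "Scroll To" selector or increase the timeout.',
  "invalid-custom-code": "Check that the custom JavaScript is valid and references only available variables.",
  "unknown": 'Try a realistic "User-Agent" and any required headers.',
};

function hintForError(message: string): string {
  return browserIssueHints[classifyBrowserError(message)];
}
```

The timeout pattern is tested first because a selector wait that times out mentions "timeout" as well, and the more specific navigation message should win.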
