Actions
- Content Actions
- Navigation Actions
Overview
This node enables interaction with websites through a cloud-based browser instance. It supports navigating to URLs, retrieving HTML content, taking screenshots, generating PDFs, and performing browser control operations such as opening, navigating, clicking elements, and closing the browser.
Common scenarios include:
- Extracting the full HTML content of a webpage for data scraping or analysis.
- Capturing screenshots of webpages for visual monitoring or reporting.
- Generating PDFs of webpages for archiving or sharing.
- Automating browser interactions like clicking elements or navigating programmatically.
Practical example: Automatically navigate to a product page, extract its HTML content, and save it for further processing or monitoring price changes.
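The product-page scenario above maps directly onto Puppeteer's documented connect/goto/content APIs. A minimal sketch, assuming a WebSocket endpoint obtained from the cloud browser service and a local `puppeteer` install (error handling omitted for brevity):

```javascript
// Sketch: navigate to a page over a remote browser and extract its HTML,
// mirroring the node's "Get HTML From Website" operation.
async function extractPageHtml(browserWSEndpoint, url) {
  const puppeteer = require('puppeteer'); // loaded lazily; assumes puppeteer is installed
  const browser = await puppeteer.connect({ browserWSEndpoint });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
  const html = await page.content(); // full HTML, as in the node's `content` output
  await browser.disconnect(); // leave the remote instance running
  return html;
}
```

Disconnecting (rather than closing) leaves the remote instance alive so the "Keep Open (Seconds)" setting governs its lifetime.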
Properties
| Name | Meaning |
|---|---|
| URL to Navigate | The URL to open in the browser. Required for navigation and content retrieval operations. |
| Navigation Options | Options controlling page navigation behavior. Wait Until: when to consider navigation finished (`load`, `domcontentloaded`, `networkidle0`, `networkidle2`). Timeout (Ms): maximum wait time in milliseconds. |
| Browser Configuration | Settings for the browser instance. Browser Type: Chrome, Chromium, or ChromeHeadlessShell. Headless Mode: run the browser without a UI. Stealth Mode: enable stealth techniques to avoid bot detection. Keep Open (Seconds): time before the instance auto-closes (0 = never). Label: a name for the instance. Save Session: save the session for later reuse. Recover Session: restore a previously saved session. |
| Custom Arguments | Additional command-line arguments to pass to the browser on startup. |
| Ignored Default Arguments | Default browser arguments to ignore when launching. |
| Proxy Configuration | Proxy server settings: Host, Port, Username, and Password. |
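The navigation options correspond one-to-one to the options object accepted by Puppeteer's `page.goto(url, options)`. A sketch with illustrative values:

```javascript
// Sketch: the node's navigation options expressed as a Puppeteer
// page.goto() options object. Values are illustrative defaults.
const navigationOptions = {
  waitUntil: 'networkidle2', // one of: load, domcontentloaded, networkidle0, networkidle2
  timeout: 30000,            // "Timeout (Ms)": maximum wait in milliseconds
};

// Usage (inside an async context with a connected page):
// await page.goto('https://example.com', navigationOptions);
```

Raising `timeout` is the first thing to try for slow-loading pages (see Troubleshooting below).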
Output
The output JSON structure varies by operation but generally includes:
For Get HTML From Website:
- `title`: Page title.
- `url`: Final URL after navigation.
- `content`: Full HTML content of the page.
For Get Screenshot From Website:
- `url`: Final URL.
- `title`: Page title.
- `screenshot`: Base64-encoded image data URI.
- `screenshotBinary`: Raw binary screenshot buffer.
- `filename`: Generated filename with timestamp.
- `fileExtension`: Image file extension (`png` or `jpg`).
- `mimeType`: MIME type of the image.
For Get PDF From Website:
- `url`: Final URL.
- `title`: Page title.
- `pdf`: Base64-encoded PDF data URI.
- `pdfBinary`: Raw binary PDF buffer.
- `filename`: Generated filename with timestamp.
- `fileExtension`: Always `pdf`.
- `mimeType`: Always `application/pdf`.
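The base64 data URIs in the screenshot and PDF outputs can be converted back to raw bytes with Node's built-in `Buffer`; a minimal sketch:

```javascript
// Sketch: decode a data URI (as in the screenshot or PDF output fields)
// back into a raw binary Buffer. Pure Node, no dependencies.
function dataUriToBuffer(dataUri) {
  const commaIndex = dataUri.indexOf(',');
  if (commaIndex === -1) throw new Error('Not a data URI');
  return Buffer.from(dataUri.slice(commaIndex + 1), 'base64');
}
```

For example, a downstream step could persist the file with `fs.writeFileSync(output.filename, dataUriToBuffer(output.pdf))`.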
For Navigation operations (open, goto, clickOnPage, close):
- Status messages indicating success.
- Relevant details like WebSocket address, session ID, clicked selector, etc.
If an error occurs and "Continue On Fail" is enabled, the output will contain an error field with the message.
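Downstream steps can branch on that `error` field. The wrapper below is an illustrative sketch of the pattern, not part of the node itself:

```javascript
// Sketch: the "Continue On Fail" pattern — a failed operation yields an
// item carrying an `error` field instead of aborting the workflow.
async function runWithContinueOnFail(operation) {
  try {
    return await operation();
  } catch (err) {
    return { error: err.message }; // matches the node's error output shape
  }
}
```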
Dependencies
- Requires an API token credential for authenticating with the external cloud browser service.
- Uses the Puppeteer library to connect to and control browser instances via WebSocket.
- Network access to `https://production.cloudbrowser.ai/api/v1/Browser/Open` for opening browser sessions.
- Optional proxy configuration support for routing browser traffic.
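A request to the Open endpoint might be assembled as below. This is a sketch only: the Bearer-token header and the body fields are assumptions, not the service's documented contract.

```javascript
// Sketch: assemble a request descriptor for the cloud browser Open
// endpoint. Header and body field names are assumptions.
const OPEN_ENDPOINT = 'https://production.cloudbrowser.ai/api/v1/Browser/Open';

function buildOpenRequest(apiToken, config = {}) {
  return {
    url: OPEN_ENDPOINT,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiToken}`, // auth scheme is an assumption
    },
    body: JSON.stringify(config),
  };
}
```

The descriptor could then be passed to `fetch(req.url, req)`; a successful response is expected to include the WebSocket address that Puppeteer connects to.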
Troubleshooting
- No WebSocket address received from the browser service: Indicates failure to open a browser session. Check API token validity, service availability, and request parameters.
- No page found in the browser (during click operation): The browser instance has no active pages; ensure the browser is properly opened and navigated before clicking.
- Timeout errors during navigation: Adjust the "Timeout (Ms)" navigation option to allow more time for slow-loading pages.
- Authentication errors: Verify that the API token credential is correctly configured and has necessary permissions.
- If the node fails unexpectedly, enabling "Continue On Fail" can help capture error messages in output for debugging.
Links and References
- Puppeteer Documentation
- Cloud browser service API endpoint used internally: `https://production.cloudbrowser.ai/api/v1/Browser/Open` (for reference only)
- General web scraping best practices and legal considerations.