CloudBrowser

Interact with websites using a cloud-based browser instance

Actions (7)

Content Actions
Navigation Actions

Overview

The CloudBrowser node interacts with websites through a cloud-based browser instance. Its Content - Get PDF From Website operation navigates to a specified URL and generates a PDF snapshot of the page, so web pages can be captured as PDFs without installing a browser locally.

Common scenarios include:

Archiving web pages in PDF format for record-keeping or offline access.
Generating printable versions of dynamic web content.
Automating report generation from web dashboards or analytics pages.

Example: Automatically navigate to a news article URL and generate a PDF version for distribution or storage.

Properties

Name	Meaning
URL to Navigate	The URL of the webpage to open and convert into a PDF.
Navigation Options	Options controlling page navigation behavior. Wait Until: when to consider navigation finished (Load, Domcontentloaded, Networkidle0, Networkidle2); Timeout (Ms): maximum time to wait for navigation, in milliseconds.
Browser Configuration	Settings for the browser instance. Browser Type: Chrome, Chromium, or ChromeHeadlessShell; Headless Mode: run the browser without a UI; Stealth Mode: enable stealth to avoid detection; Keep Open (Seconds): time before the browser closes automatically (0 = never); Label: instance name; Save Session: save the session for reuse; Recover Session: recover a saved session.
Custom Arguments	Additional command-line arguments passed to the browser on startup.
Ignored Default Arguments	Default browser arguments to ignore when launching.
Proxy Configuration	Proxy server settings: Host, Port, Username, Password.
PDF Options	PDF generation options. Format: paper size (A0, A1, A2, A3, A4, A5, A6, Legal, Letter, Tabloid); Landscape: generate the PDF in landscape orientation; Print Background: include background graphics; Scale: rendering scale (0.1 to 2); Margin: margins in millimeters (Top, Right, Bottom, Left); Page Ranges: specific pages to print (e.g., "1-5, 8, 11-13").
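
The constraints listed for PDF Options can be checked before a workflow runs. The following is a minimal Python sketch, not the node's own code: the function names and the option keys (format, scale, margin, pageRanges, and the lowercase margin sides) are hypothetical and only mirror the property names above, while the allowed paper sizes and the 0.1 to 2 scale range come from the table.

```python
import re

# Paper sizes and scale bounds as listed in the PDF Options row.
ALLOWED_FORMATS = {"A0", "A1", "A2", "A3", "A4", "A5", "A6", "Legal", "Letter", "Tabloid"}
MIN_SCALE, MAX_SCALE = 0.1, 2.0
MARGIN_SIDES = {"top", "right", "bottom", "left"}  # hypothetical key names

# Matches a single page ("8") or a range ("11-13").
_RANGE_PART = re.compile(r"^(\d+)(?:-(\d+))?$")


def validate_page_ranges(page_ranges: str) -> list:
    """Return a list of problems found in a string such as '1-5, 8, 11-13'."""
    problems = []
    for part in page_ranges.split(","):
        part = part.strip()
        match = _RANGE_PART.match(part)
        if not match:
            problems.append(f"unparseable page range: {part!r}")
            continue
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        # Pages are counted from 1, and a range must not run backwards.
        if start < 1 or end < start:
            problems.append(f"invalid page range: {part!r}")
    return problems


def validate_pdf_options(options: dict) -> list:
    """Collect readable problems; an empty list means nothing was flagged."""
    problems = []
    if "format" in options and options["format"] not in ALLOWED_FORMATS:
        problems.append(f"unknown paper format: {options['format']!r}")
    if "scale" in options and not MIN_SCALE <= options["scale"] <= MAX_SCALE:
        problems.append(f"scale must be between {MIN_SCALE} and {MAX_SCALE}")
    for side, value in options.get("margin", {}).items():
        if side not in MARGIN_SIDES:
            problems.append(f"unknown margin side: {side!r}")
        elif value < 0:
            problems.append(f"margin {side} must not be negative")
    if options.get("pageRanges"):
        problems.extend(validate_page_ranges(options["pageRanges"]))
    return problems
```

Calling validate_pdf_options with a dictionary of intended settings returns an empty list when nothing is flagged. Returning messages instead of raising on the first failure is a design choice that lets a workflow report every problem in one pass.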

Output

The node outputs JSON data containing:

url: The final URL of the loaded webpage.
title: The page title.
pdf: The generated PDF as a base64-encoded string, prefixed with a data URI header (data:application/pdf;base64,...).
pdfBinary: The raw binary buffer of the PDF file.
filename: Suggested filename for the PDF, e.g., webpage_<timestamp>.pdf.
fileExtension: Always "pdf".
mimeType: Always "application/pdf".

This output allows downstream nodes to save the PDF file, send it via email, or upload it to storage services.
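
As an illustration of downstream handling, the pdf field can be turned back into a file by stripping the data URI prefix and decoding the rest. This is a hedged Python sketch that assumes the output item is available as a plain dictionary with the pdf and filename fields described above; the function names are hypothetical.

```python
import base64
from pathlib import Path

# Prefix documented for the "pdf" output field.
DATA_URI_PREFIX = "data:application/pdf;base64,"


def pdf_bytes_from_item(item: dict) -> bytes:
    """Decode the base64 payload of the item's "pdf" data URI."""
    pdf_field = item["pdf"]
    if not pdf_field.startswith(DATA_URI_PREFIX):
        raise ValueError("pdf field does not start with the expected data URI prefix")
    payload = pdf_field[len(DATA_URI_PREFIX):]
    return base64.b64decode(payload, validate=True)


def save_pdf(item: dict, directory: str) -> Path:
    """Write the decoded PDF into `directory` under the suggested filename."""
    data = pdf_bytes_from_item(item)
    # PDF files begin with the marker "%PDF-"; anything else is suspicious.
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded data does not look like a PDF")
    # Keep only the final path component so the filename cannot escape `directory`.
    target = Path(directory) / Path(item["filename"]).name
    target.write_bytes(data)
    return target
```

For example, save_pdf(item, "/tmp") would write the file under /tmp using the item's suggested filename and return its path.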

Dependencies

Requires an active internet connection to reach the target URL.
Uses a cloud browser service accessible via API at https://production.cloudbrowser.ai/api/v1/Browser/Open.
Requires an API token credential for authentication with the cloud browser service.
Uses the Puppeteer library internally to control the browser session.
Does not need a local browser installation; all browsing happens remotely.
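
To make the dependency chain concrete, the sketch below shows one way a client might request a browser instance from the endpoint listed above and read back a WebSocket address. Only the URL comes from this page. The bearer-token header, the JSON request body, and the "address" response key are assumptions made for illustration, and the function names are hypothetical; consult the service's own documentation for the real contract.

```python
import json
import urllib.request

# Endpoint as given in the Dependencies list.
OPEN_URL = "https://production.cloudbrowser.ai/api/v1/Browser/Open"


def build_open_request(api_token: str, settings: dict) -> urllib.request.Request:
    """Prepare the HTTP request without sending it."""
    # ASSUMPTION: bearer-token auth and a JSON body; the real API may differ.
    return urllib.request.Request(
        OPEN_URL,
        data=json.dumps(settings).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_ws_address(payload: dict) -> str:
    """Pull the WebSocket address out of a decoded response body."""
    # ASSUMPTION: the address is returned under an "address" key.
    address = payload.get("address")
    if not address:
        raise RuntimeError("No WebSocket address received from the browser service")
    return address


def open_browser(api_token: str, settings: dict, timeout_s: float = 30.0) -> str:
    """Send the request and return the WebSocket address (performs network I/O)."""
    request = build_open_request(api_token, settings)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return extract_ws_address(json.load(response))
```

Splitting request construction and response parsing from the network call keeps the two assumption-heavy pieces easy to adjust and to test offline.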

Troubleshooting

No WebSocket address received from the browser service: Indicates failure to open a browser instance. Check API token validity and service availability.
Navigation timeout: If the page takes too long to load, increase Timeout (Ms) under Navigation Options.
Invalid URL or unreachable site: Ensure the URL is correct and accessible from the cloud browser environment.
PDF generation errors: Verify PDF options are valid; unsupported margin values or page ranges may cause failures.
Session recovery issues: If using saved sessions, ensure session data exists and is not corrupted.
Proxy configuration problems: Incorrect proxy details can prevent navigation; verify host, port, and credentials.
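
Two of the issues above, a malformed URL and incorrect proxy details, can be caught with a quick pre-flight check before the node runs. The sketch below is a hedged Python example: the function names and the proxy dictionary keys (host, port) are hypothetical, and restricting URLs to http and https is an assumption about what the node accepts.

```python
from urllib.parse import urlparse


def check_url(url: str) -> list:
    """Return problems with the URL to navigate; an empty list means none found."""
    parsed = urlparse(url)
    problems = []
    # ASSUMPTION: only http and https targets are expected.
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    if not parsed.hostname:
        problems.append("URL has no host name")
    return problems


def check_proxy(proxy: dict) -> list:
    """Return problems with proxy host and port; an empty list means none found."""
    problems = []
    if not str(proxy.get("host", "")).strip():
        problems.append("proxy host is empty")
    try:
        port = int(proxy.get("port", ""))
    except (TypeError, ValueError):
        problems.append("proxy port is not a number")
    else:
        # Valid TCP ports lie in the range 1 to 65535.
        if not 1 <= port <= 65535:
            problems.append("proxy port must be between 1 and 65535")
    return problems
```

These checks only cover syntax. Whether the site or proxy is actually reachable from the cloud browser environment still has to be confirmed by running the node.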
