Overview
The node "CloudBrowser" enables interaction with websites through a cloud-based browser instance. Specifically, the Content - Get PDF From Website operation navigates to a specified URL and generates a PDF snapshot of the webpage. This is useful for automating the capture of web pages as PDFs without needing local browser installations.
Common scenarios include:
- Archiving web pages in PDF format for record-keeping or offline access.
- Generating printable versions of dynamic web content.
- Automating report generation from web dashboards or analytics pages.
Example: Automatically navigate to a news article URL and generate a PDF version for distribution or storage.
Properties
| Name | Meaning |
|---|---|
| URL to Navigate | The URL of the webpage to open and convert into a PDF. |
| Navigation Options | Options controlling page navigation behavior. Wait Until: when to consider navigation finished (Load, Domcontentloaded, Networkidle0, Networkidle2); Timeout (Ms): maximum wait time for navigation, in milliseconds. |
| Browser Configuration | Settings for the browser instance. Browser Type: Chrome, Chromium, or ChromeHeadlessShell; Headless Mode: run the browser without a UI; Stealth Mode: enable stealth to avoid detection; Keep Open (Seconds): time before the browser auto-closes (0 = never); Label: instance name; Save Session: save the session for reuse; Recover Session: recover a saved session. |
| Custom Arguments | Additional command-line arguments passed to the browser on startup. |
| Ignored Default Arguments | Default browser arguments to ignore when launching. |
| Proxy Configuration | Proxy server settings: Host, Port, Username, Password. |
| PDF Options | PDF generation options. Format: paper size (A0, A1, A2, A3, A4, A5, A6, Legal, Letter, Tabloid); Landscape: generate the PDF in landscape orientation; Print Background: include background graphics; Scale: rendering scale (0.1 to 2); Margin: margins in millimeters (Top, Right, Bottom, Left); Page Ranges: specific pages to print (e.g., "1-5, 8, 11-13"). |
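Since the node drives the browser through Puppeteer, the PDF Options above plausibly end up as an options object for Puppeteer's page.pdf() call. The following is a minimal sketch of such a mapping, not the node's actual implementation: the input field names (such as marginMm), the buildPdfOptions helper, and the defaults (A4, scale 1, zero margins) are assumptions made for illustration. Only the output keys (format, landscape, printBackground, scale, margin, pageRanges) correspond to Puppeteer's documented PDF options.

```typescript
// Hypothetical input shape mirroring the PDF Options row of the table.
// These field names are illustrative, not the node's real parameter names.
interface PdfOptionsInput {
  format?: string;
  landscape?: boolean;
  printBackground?: boolean;
  scale?: number;
  marginMm?: { top?: number; right?: number; bottom?: number; left?: number };
  pageRanges?: string;
}

// Subset of the options object accepted by Puppeteer's page.pdf().
interface PdfRenderOptions {
  format: string;
  landscape: boolean;
  printBackground: boolean;
  scale: number;
  margin: { top: string; right: string; bottom: string; left: string };
  pageRanges: string;
}

// Paper sizes listed in the Properties table.
const PAPER_FORMATS: string[] = [
  "A0", "A1", "A2", "A3", "A4", "A5", "A6", "Legal", "Letter", "Tabloid",
];

function buildPdfOptions(input: PdfOptionsInput = {}): PdfRenderOptions {
  const format = input.format ?? "A4"; // assumed default
  if (PAPER_FORMATS.indexOf(format) === -1) {
    throw new Error(`Unsupported paper format: ${format}`);
  }

  const scale = input.scale ?? 1; // assumed default
  if (scale < 0.1 || scale > 2) {
    throw new RangeError(`Scale must be between 0.1 and 2, got ${scale}`);
  }

  // The table gives margins in millimeters; Puppeteer accepts strings with units.
  const toMm = (value?: number): string => `${value ?? 0}mm`;
  const margin = input.marginMm ?? {};

  return {
    format,
    landscape: input.landscape ?? false,
    printBackground: input.printBackground ?? false,
    scale,
    margin: {
      top: toMm(margin.top),
      right: toMm(margin.right),
      bottom: toMm(margin.bottom),
      left: toMm(margin.left),
    },
    pageRanges: input.pageRanges ?? "",
  };
}
```

Rejecting an out-of-range scale or an unknown format before the request is sent gives a clearer error than a failure reported from the remote browser.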
Output
The node outputs JSON data containing:
- url: The final URL of the loaded webpage.
- title: The page title.
- pdf: A base64-encoded string representing the generated PDF file, prefixed with the appropriate data URI (data:application/pdf;base64,...).
- pdfBinary: The raw binary buffer of the PDF file.
- filename: Suggested filename for the PDF, e.g., webpage_<timestamp>.pdf.
- fileExtension: Always "pdf".
- mimeType: Always "application/pdf".
This output allows downstream nodes to save the PDF file, send it via email, or upload it to storage services.
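As an illustration of consuming this output outside the workflow, the sketch below decodes the pdf data URI back into bytes and writes it to disk under the suggested filename. It assumes a Node.js environment; the helper names decodePdfDataUri and savePdf are hypothetical and not part of the node.

```typescript
import { writeFileSync } from "node:fs";
import { join } from "node:path";

// Prefix described in the output documentation for the `pdf` field.
const PDF_DATA_URI_PREFIX = "data:application/pdf;base64,";

// Strip the data URI prefix and decode the base64 payload into raw bytes.
function decodePdfDataUri(dataUri: string): Buffer {
  if (!dataUri.startsWith(PDF_DATA_URI_PREFIX)) {
    throw new Error("Expected a data:application/pdf;base64 URI");
  }
  return Buffer.from(dataUri.slice(PDF_DATA_URI_PREFIX.length), "base64");
}

// Write the decoded PDF into `directory` using the node's suggested filename.
// Returns the path that was written.
function savePdf(item: { pdf: string; filename: string }, directory = "."): string {
  const target = join(directory, item.filename);
  writeFileSync(target, decodePdfDataUri(item.pdf));
  return target;
}
```

When the pdfBinary buffer is available directly, it can be written as-is and the decoding step is unnecessary.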
Dependencies
- Requires an active internet connection to reach the target URL.
- Uses a cloud browser service accessible via API at https://production.cloudbrowser.ai/api/v1/Browser/Open.
- Requires an API token credential for authentication with the cloud browser service.
- Puppeteer library is used internally to control the browser session.
- No local browser installation needed; all browsing happens remotely.
Troubleshooting
- No WebSocket address received from the browser service: Indicates failure to open a browser instance. Check API token validity and service availability.
- Navigation timeout: If the page takes too long to load, increase the Timeout property under Navigation Options.
- Invalid URL or unreachable site: Ensure the URL is correct and accessible from the cloud browser environment.
- PDF generation errors: Verify PDF options are valid; unsupported margin values or page ranges may cause failures.
- Session recovery issues: If using saved sessions, ensure session data exists and is not corrupted.
- Proxy configuration problems: Incorrect proxy details can prevent navigation; verify host, port, and credentials.
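Two of the failure modes above, an invalid URL and a malformed page range, can be caught before the node runs. The checks below are a rough sketch under stated assumptions: the function names are hypothetical, the accepted range syntax is inferred from the "1-5, 8, 11-13" example, and an empty range is assumed to mean "all pages". The node's own validation may differ.

```typescript
// True when `value` parses as an absolute http(s) URL.
function isHttpUrl(value: string): boolean {
  try {
    const url = new URL(value);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false; // new URL() throws on strings it cannot parse
  }
}

// True when `ranges` is a comma-separated list of pages ("8") or
// ascending ranges ("1-5"), with page numbers starting at 1.
function isValidPageRanges(ranges: string): boolean {
  if (ranges.trim() === "") {
    return true; // assumption: empty means "print all pages"
  }
  return ranges.split(",").every((part) => {
    const match = /^\s*(\d+)(?:\s*-\s*(\d+))?\s*$/.exec(part);
    if (!match) {
      return false;
    }
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    return start >= 1 && end >= start;
  });
}
```

Running both checks in a step ahead of the node lets a workflow fail fast with a specific message instead of waiting for a navigation timeout or a PDF generation error.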