Browserless

Browserless API

Browser Rest Apis Actions

Overview

This node calls the Browserless API to scrape web pages with an automated browser. It can navigate to a URL, wait for specific elements or events, inject scripts or styles, set HTTP headers, authenticate, and configure browser launch options. It is useful for extracting data from dynamic pages that require JavaScript execution or complex interactions, for example scraping product details from e-commerce sites, gathering social media content, or automating form submissions.

Use Case Examples

Scrape product information from an e-commerce site by specifying CSS selectors for product titles and prices.
Wait for a specific element to appear on a page before extracting its content, useful for pages that load data asynchronously.
Inject custom JavaScript into a page to manipulate the DOM or extract data not directly accessible via selectors.
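
As a rough illustration of the first use case, the sketch below builds a request body containing a URL and a list of CSS selectors. The key names `url`, `elements`, `selector`, and `timeout` are assumptions derived from the Url and Elements properties of this node; the example URL, the selectors, and the helper `build_scrape_body` are hypothetical.

```python
import json


def build_scrape_body(url, selectors, timeout_ms=None):
    """Return a JSON-serializable body with a URL and one entry per CSS selector.

    Key names are assumed from the node's property names, not confirmed here.
    """
    elements = []
    for selector in selectors:
        element = {"selector": selector}
        if timeout_ms is not None:
            # Optional per-element timeout in milliseconds.
            element["timeout"] = timeout_ms
        elements.append(element)
    return {"url": url, "elements": elements}


# Hypothetical product page and selectors, for illustration only.
body = build_scrape_body(
    "https://example.com/products",
    [".product-title", ".product-price"],
    timeout_ms=5000,
)
print(json.dumps(body, indent=2))
```

The printed JSON could be pasted into the Custom Body property if the standard parameters are not flexible enough.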

Properties

Name	Meaning
Url	The URL of the web page to scrape. This is a required input.
Elements	A collection of CSS selectors and optional timeouts to specify which elements to scrape from the page.
Wait For Timeout	Time in milliseconds to wait before proceeding, useful for waiting for page content to load.
Wait For Selector	Specify selectors to wait for with options for visibility, hidden state, and timeout.
Goto Options	Options for navigating to the URL, including referer, timeout, and waitUntil events.
Wait For Event	Specify an event and timeout to wait for during scraping.
Wait For Function	A JavaScript function to evaluate in the browser context, with polling and timeout options.
Add Script Tag	Scripts to inject into the page, specified by URL, path, content, type, and id.
Add Style Tag	CSS styles to inject into the page, specified by URL, path, or raw content.
Set Extra HTTP Headers	Additional HTTP headers to include in requests.
Authenticate	Username and password for HTTP authentication.
Viewport	Settings for the browser viewport, including width, height, device scale factor, and mobile/landscape/touch options.
Emulate Media Type	Media type to emulate in the browser, e.g., screen or print.
Timeout	Override the system-level timeout for the request in milliseconds.
Html	Raw HTML content to load instead of navigating to a URL.
User Agent	User agent string to use for the browser session.
Best Attempt	If true, the node attempts to proceed even if awaited events fail or time out.
Enable Cookies	Enable or disable cookie handling.
Cookies	Array of cookie objects to set in the browser session.
Block Ads	Enable or disable ad-blocking extensions during the session.
Set Java Script Enabled	Enable or disable JavaScript execution in the browser.
Enable Launch	Whether to launch a new browser instance.
Launch	Options for launching the browser, including arguments, viewport, devtools, headless mode, and more.
Reject Resource Types	Resource types to block from loading, such as images, scripts, or stylesheets.
Reject Request Pattern	Patterns of requests to block during scraping.
Request Interceptors	Patterns and corresponding responses to intercept and fulfill requests.
Debug Opts	Options to enable debugging features like console logs, cookies, HTML, network, and screenshots.
Use Custom Body	Whether to use a custom JSON body for the request instead of the standard parameters.
Custom Body	Custom JSON body to send with the request, allowing full control over scraping parameters.
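
To show how several of the properties above might combine, here is a minimal sketch of a Custom Body that waits for a selector, tunes navigation, and blocks some resource types. The camelCase keys are assumed to mirror the property names in the table; the values `networkidle2`, `image`, and `stylesheet` are Puppeteer-style values and are also assumptions about what the API accepts. The URL and selectors are hypothetical.

```python
import json

# Assumed key names: camelCase versions of the properties listed in the table.
custom_body = {
    "url": "https://example.com/listing",          # hypothetical page
    "elements": [{"selector": "h1"}],              # what to scrape
    "gotoOptions": {"waitUntil": "networkidle2", "timeout": 30000},
    "waitForSelector": {"selector": "#results", "visible": True, "timeout": 10000},
    "rejectResourceTypes": ["image", "stylesheet"],  # skip heavy resources
    "bestAttempt": True,                           # continue if a wait fails
}

# Serialize to the JSON text that the Custom Body property expects.
custom_body_json = json.dumps(custom_body, indent=2)
print(custom_body_json)
```

Keeping the body as a Python dictionary until the last step makes it easy to validate or adjust individual options before serializing.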

Output

JSON

data - The scraped data extracted from the web page based on the specified selectors and options.
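
The exact shape of `data` is not described here. The sketch below assumes one plausible layout: a list with one entry per selector, each holding a `results` list whose items carry a `text` field. That layout, the sample values, and the helper `texts_by_selector` are all assumptions for illustration; adjust the field names to match the actual output.

```python
def texts_by_selector(output):
    """Group the text of each scraped result under its selector.

    Assumes output looks like:
    {"data": [{"selector": "...", "results": [{"text": "..."}]}]}
    """
    grouped = {}
    for entry in output.get("data", []):
        texts = [result.get("text", "").strip() for result in entry.get("results", [])]
        grouped[entry.get("selector", "")] = texts
    return grouped


# Hypothetical output, shaped according to the assumption above.
sample_output = {
    "data": [
        {"selector": ".product-title", "results": [{"text": " Widget "}, {"text": "Gadget"}]},
        {"selector": ".product-price", "results": [{"text": "$10"}]},
    ]
}
grouped = texts_by_selector(sample_output)
```

Using `.get` with defaults keeps the helper from raising if a selector matched nothing or a field is absent.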

Dependencies

Browserless API

Troubleshooting

Timeout errors occur if the page takes too long to load or elements do not appear within the specified timeout. Increase the timeout values or enable the bestAttempt option.
Authentication failures occur if an incorrect username or password is provided. Verify the credentials.
Selectors may not match any elements. Ensure the CSS selectors are correct and the elements exist on the page.
Blocked resources can cause incomplete page loads. Adjust the rejectResourceTypes or blockAds settings.
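
One way to apply the timeout advice above is to derive a more lenient copy of a request body before retrying. The sketch below multiplies every `timeout` value it finds and turns on `bestAttempt`. The helper `relax_timeouts` and the sample body are hypothetical, and the key names are assumed to match the camelCase option names used in this section.

```python
import copy


def relax_timeouts(body, factor=2):
    """Return a copy of body with every numeric 'timeout' scaled and bestAttempt enabled."""
    relaxed = copy.deepcopy(body)  # leave the caller's body untouched
    relaxed["bestAttempt"] = True

    def scale(node):
        if isinstance(node, dict):
            for key, value in node.items():
                is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
                if key == "timeout" and is_number:
                    node[key] = value * factor
                else:
                    scale(value)
        elif isinstance(node, list):
            for item in node:
                scale(item)

    scale(relaxed)
    return relaxed


# Hypothetical body with two timeouts at different nesting levels.
original = {
    "url": "https://example.com",
    "elements": [{"selector": "h1", "timeout": 5000}],
    "waitForSelector": {"selector": "#app", "timeout": 10000},
}
retry_body = relax_timeouts(original)
```

Because the function works on a deep copy, the original body remains available for comparison or for a first attempt with stricter limits.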
