Overview
The Hyperbrowser node enables automated interaction with websites using browser automation and web scraping techniques powered by the Hyperbrowser SDK. It supports multiple operations, including browsing with AI agents, crawling websites, extracting structured data, scraping URLs, and driving computer-use actions via OpenAI's computer-using agent (CUA).
This node is beneficial for scenarios such as:
- Automating complex browser tasks like form filling or navigation based on natural language instructions.
- Crawling websites to gather content from multiple pages.
- Extracting structured data from webpages using AI-driven queries.
- Scraping webpage content in different formats (Markdown, HTML, links).
- Using AI agents to control computer/browser actions programmatically.
Practical examples:
- Automatically log into a website and download reports by instructing the node with a task description.
- Crawl an e-commerce site to collect product information across multiple pages.
- Extract pricing or contact details from a webpage using a custom extraction query.
- Scrape blog posts in Markdown format for further processing.
- Use OpenAI-based user action automation to simulate complex workflows.
Properties
| Name | Meaning |
|---|---|
| Task | Instructions for browser automation, e.g., "Click the login button and fill in the form". |
| Options | Collection of optional parameters: |
| - Max Steps | Maximum number of steps the agent should take to complete the task (default 25). |
| - Maximum Pages | Maximum number of pages to crawl when crawling (default 10). |
| - Only Main Content | Whether to return only the main content of the page during scrape/crawl (default true). |
| - Output Format | Format of output content: HTML, Links, or Markdown (default Markdown). |
| - Proxy Country | Country code for proxy server if proxy is used (e.g., "US"). |
| - Solve CAPTCHAs | Whether to solve CAPTCHAs encountered during scraping (default false). |
| - Timeout (Ms) | Maximum timeout in milliseconds for navigating to a page (default 15000 ms). |
| - Use Proxy | Whether to use a proxy server for scraping (default false). |
| - Use Vision | Whether to enable vision capabilities for Browser Use LLM operation (default true). |
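The defaults listed in the Options table can be summarized as a plain configuration object. The sketch below is illustrative only: the camelCase key names and the `resolveOptions` helper are hypothetical, not the node's internal identifiers.

```typescript
// Illustrative defaults mirroring the Options table above.
// Key names are hypothetical camelCase forms, not the node's internals.
interface HyperbrowserOptions {
  maxSteps: number;
  maxPages: number;
  onlyMainContent: boolean;
  outputFormat: "markdown" | "html" | "links";
  solveCaptchas: boolean;
  timeoutMs: number;
  useProxy: boolean;
  useVision: boolean;
  proxyCountry?: string; // e.g. "US"; only relevant when useProxy is true
}

const DEFAULTS: HyperbrowserOptions = {
  maxSteps: 25,
  maxPages: 10,
  onlyMainContent: true,
  outputFormat: "markdown",
  solveCaptchas: false,
  timeoutMs: 15000,
  useProxy: false,
  useVision: true,
};

// Merge user-supplied overrides on top of the documented defaults.
function resolveOptions(
  overrides: Partial<HyperbrowserOptions> = {},
): HyperbrowserOptions {
  return { ...DEFAULTS, ...overrides };
}
```

For example, `resolveOptions({ timeoutMs: 30000 })` raises only the navigation timeout while every other option keeps its documented default.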
Output
The node outputs JSON objects containing results specific to the selected operation:
OpenAI CUA (and other agent-based operations):
- actions: The final result of the AI agent's performed actions, as a string or structured data.
- status: Status code or message indicating success or failure.
Scrape:
- url: The URL that was scraped.
- content: The scraped content in the requested format (Markdown, HTML, or links).
- status: Status of the scraping operation.
Crawl:
- url: The starting URL of the crawl.
- data: Aggregated data collected from the crawled pages.
- status: Status of the crawling operation.
Extract:
- url: The URL from which data was extracted.
- extractedData: Data extracted according to the provided query and schema.
- status: Status of the extraction operation.
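The per-operation output shapes above can be modeled as a discriminated union. This is an illustrative TypeScript sketch, not the node's actual type definitions; the `operation` discriminant and the `summarize` helper are assumptions added for the example.

```typescript
// Hypothetical models of the JSON outputs described above.
type AgentOutput = { operation: "agent"; actions: string; status: string };
type ScrapeOutput = { operation: "scrape"; url: string; content: string; status: string };
type CrawlOutput = { operation: "crawl"; url: string; data: unknown[]; status: string };
type ExtractOutput = { operation: "extract"; url: string; extractedData: unknown; status: string };

type NodeOutput = AgentOutput | ScrapeOutput | CrawlOutput | ExtractOutput;

// Narrow on the discriminant to read operation-specific fields safely.
function summarize(out: NodeOutput): string {
  switch (out.operation) {
    case "agent":
      return `agent finished (${out.status})`;
    case "scrape":
      return `scraped ${out.url} (${out.status})`;
    case "crawl":
      return `crawled ${out.data.length} pages from ${out.url}`;
    case "extract":
      return `extracted data from ${out.url} (${out.status})`;
  }
}
```

Downstream nodes can branch on the same discriminant instead of probing for the presence of individual fields.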
This node outputs only JSON data describing web content and agent actions; it does not produce binary output.
Dependencies
- Requires an API key credential for the Hyperbrowser service to authenticate requests.
- Depends on the @hyperbrowser/sdk package for interacting with the Hyperbrowser API.
- Requires network access to the target URLs, and a proxy configuration if proxying is enabled.
- No additional environment variables are explicitly required beyond the API key credential.
Troubleshooting
Common Issues:
- Invalid or missing API key credential will cause authentication failures.
- Network timeouts or unreachable URLs may cause navigation or scraping to fail.
- Incorrect task instructions might lead to incomplete or failed browser automation.
- Misconfigured proxy settings can get requests blocked or trigger CAPTCHA challenges.
Error Messages:
- "Operation "<operation>" is not supported": Occurs if an unsupported operation value is set; verify the Operation property.
- Timeout errors: Increase the Timeout (Ms) option to allow more time for page loads.
- CAPTCHA-related failures: Enable "Solve CAPTCHAs" option if encountering frequent CAPTCHA challenges.
- Parsing errors on extraction schema: Ensure the JSON schema is valid and correctly formatted.
To handle errors gracefully, enable "Continue On Fail" in the node settings to receive error details in output instead of stopping execution.
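The "Continue On Fail" behavior can be approximated with a per-item try/catch that converts failures into error items instead of aborting the run. A minimal sketch; the function and type names are illustrative, not n8n internals, and the real node processes items asynchronously (kept synchronous here for brevity):

```typescript
type Item = { json: Record<string, unknown> };

// Mimics n8n's "Continue On Fail" setting: on error, emit an error
// item and keep processing instead of aborting the whole execution.
function runWithContinueOnFail(
  items: Item[],
  op: (item: Item) => Item,
  continueOnFail: boolean,
): Item[] {
  const results: Item[] = [];
  for (const item of items) {
    try {
      results.push(op(item));
    } catch (err) {
      if (!continueOnFail) throw err; // default behavior: stop execution
      results.push({ json: { error: (err as Error).message } });
    }
  }
  return results;
}
```

With this pattern, a single unreachable URL yields one `{ error: ... }` item while the remaining items still produce normal output.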
Links and References
- Hyperbrowser SDK Documentation
- n8n Documentation on Creating Custom Nodes
- OpenAI API (for understanding AI agent capabilities)