Hyperbrowser

Interact with websites using Hyperbrowser

Overview

The Hyperbrowser node enables automated interaction with websites using a variety of operations powered by the Hyperbrowser SDK. It supports tasks such as browser automation, web crawling, scraping, and data extraction. This node is useful for scenarios where you want to automate complex browsing tasks, gather structured data from websites, or control browser actions programmatically.

For example, you can instruct the node to:

  • Automate a login process and form filling on a website.
  • Crawl a website to collect content from multiple pages.
  • Scrape specific information from a webpage in different formats (HTML, Markdown, or links).
  • Extract structured data based on custom queries and schemas.
  • Use AI agents to perform browser-based tasks with vision capabilities.

This flexibility makes it ideal for workflows involving web automation, data collection, and AI-driven browsing tasks.

Properties

  • Task: Instructions for browser automation, e.g., "Click the login button and fill in the form".
  • Options: Collection of additional settings:
    • Max Steps: Maximum number of steps the agent should take to complete the task (default: 25).
    • Use Vision: Whether to enable vision capabilities for the Browser Use operation (default: true).
    • Use Proxy: Whether to use a proxy server during scraping or crawling (default: false).
    • Proxy Country: Country code for the proxy server if the proxy is enabled, e.g., "US" (default: empty).
    • Solve CAPTCHAs: Whether to attempt solving CAPTCHAs encountered during scraping (default: false).
    • Timeout (Ms): Maximum time in milliseconds to wait when navigating to a page (default: 15000).
    • Maximum Pages: Maximum number of pages to crawl when performing a crawl operation (default: 10).
    • Only Main Content: Whether to return only the main content of the page during scrape or crawl (default: true).
    • Output Format: Format of the output content for scrape or crawl operations: HTML, Links, or Markdown.
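For reference, the defaults above correspond to an options object shaped roughly like the following. The field names here are illustrative; the node's internal parameter keys may differ.

```json
{
  "maxSteps": 25,
  "useVision": true,
  "useProxy": false,
  "proxyCountry": "US",
  "solveCaptchas": false,
  "timeout": 15000,
  "maxPages": 10,
  "onlyMainContent": true,
  "outputFormat": "markdown"
}
```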

Output

The node outputs JSON data containing the results of the selected operation:

  • For Browser Use operation:

    • actions: The final result of the browser automation task, typically a summary or outcome of the instructed actions.
    • status: Status of the operation execution.
  • For Scrape operation:

    • url: The URL that was scraped.
    • content: The scraped content in the requested format (HTML, Markdown, or links).
    • status: Status of the scraping operation.
  • For Crawl operation:

    • url: The starting URL crawled.
    • data: Aggregated data collected from crawling multiple pages.
    • status: Status of the crawling operation.
  • For Extract operation:

    • url: The URL from which data was extracted.
    • extractedData: The structured data extracted according to the provided query and schema.
    • status: Status of the extraction operation.
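Downstream nodes can branch on these shapes. A minimal sketch of a helper that picks the primary payload field for each operation; the helper itself is hypothetical, but the field names are taken from the list above:

```javascript
// Hypothetical helper: return the primary payload of a Hyperbrowser
// node result, based on which operation produced it.
function getPayload(operation, output) {
  switch (operation) {
    case "browserUse":
      return output.actions;        // summary/outcome of the automation task
    case "scrape":
      return output.content;        // HTML, Markdown, or links
    case "crawl":
      return output.data;           // aggregated multi-page data
    case "extract":
      return output.extractedData;  // structured data matching the schema
    default:
      throw new Error(`Operation "${operation}" is not supported`);
  }
}

// Example: a scrape result in Markdown format
const scrapeOutput = {
  url: "https://example.com",
  content: "# Example Domain",
  status: "completed",
};
console.log(getPayload("scrape", scrapeOutput)); // "# Example Domain"
```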

If an error occurs and the node is configured to continue on failure, the output will include an error field describing the issue along with the attempted task.
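With "Continue On Fail" enabled, a subsequent Code node can separate successes from failures by checking for the error field. A sketch, assuming the item shape described above:

```javascript
// Split Hyperbrowser results into successes and failures.
// An item with an `error` field is a failed attempt; all others succeeded.
function splitResults(items) {
  const ok = items.filter((item) => item.error === undefined);
  const failed = items.filter((item) => item.error !== undefined);
  return { ok, failed };
}

const items = [
  { url: "https://example.com", content: "...", status: "completed" },
  { error: "Navigation timed out", task: "Scrape the pricing page" },
];
const { ok, failed } = splitResults(items);
console.log(ok.length, failed.length); // 1 1
```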

Dependencies

  • Requires an API key credential for the Hyperbrowser service to authenticate requests.
  • Uses the @hyperbrowser/sdk package internally to interact with the Hyperbrowser API.
  • No additional environment variables are explicitly required beyond the API key credential.
  • Network access is needed for web interactions, optionally via proxies if enabled.

Troubleshooting

  • Common issues:

    • Invalid or missing API key credential will cause authentication failures.
    • Network timeouts if the target website is slow or unresponsive; consider increasing the timeout option.
    • CAPTCHA challenges may block scraping unless the "Solve CAPTCHAs" option is enabled.
    • Proxy misconfiguration can lead to connection errors; verify proxy country codes and availability.
    • Unsupported operations will throw an error indicating the operation is not supported.
  • Error messages:

    • "Operation \"<operation>\" is not supported": Indicates an invalid operation parameter; ensure the operation name matches one of the supported options.
    • Network or HTTP errors from the Hyperbrowser API will be surfaced as error messages; check connectivity and credentials.
    • JSON parsing errors for extraction schema if the schema input is malformed.
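To avoid the schema parsing error above, you can validate a user-supplied extraction schema before running the node, for example in a preceding Code node. This is a generic JSON sanity check, not a Hyperbrowser API:

```javascript
// Try to parse a user-supplied extraction schema.
// Returns the parsed schema, or null if the text is malformed.
function parseSchema(schemaText) {
  try {
    const schema = JSON.parse(schemaText);
    // A usable schema must at least be a JSON object.
    return typeof schema === "object" && schema !== null ? schema : null;
  } catch {
    return null;
  }
}

console.log(parseSchema('{"type": "object"}')); // { type: 'object' }
console.log(parseSchema("{not valid json"));    // null
```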

To resolve errors, verify all input parameters, credentials, and network settings. Enable "Continue On Fail" to handle errors gracefully within workflows.
