Remote Playwright

Interact with Remote Playwright Instance

Overview

The node interacts with a remote Playwright instance to extract data from table rows on a web page. It connects to an existing browser session identified by an instance ID, selects table rows using a CSS selector, and retrieves structured data from those rows. This is useful for automating web scraping tasks where tabular data needs to be extracted dynamically from websites.
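To make the idea of "structured data from table rows" concrete, here is a minimal local sketch. It is not the node's implementation: it parses a hard-coded HTML string with Python's standard `html.parser` instead of talking to a remote browser, and it collects every `<tr>` rather than applying a CSS selector. The names `TableRowParser`, `extract_rows`, and `SAMPLE_HTML` are hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return a list of rows, each row a list of cell texts."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page content, used only for illustration.
SAMPLE_HTML = """
<table id="prices">
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

rows = extract_rows(SAMPLE_HTML)
```

In the real node the browser evaluates the selector against the live page, so dynamically rendered rows are included; this sketch only covers static markup.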

Practical examples include:

  • Extracting product listings or pricing tables from e-commerce sites.
  • Gathering user data or reports presented in table format on dashboards.
  • Automating data collection from any web application that displays information in tables.

Properties

Name | Meaning
Instance ID | The unique identifier of the active Playwright browser instance to interact with.
Close Browser on error | Whether to close the browser if an error occurs during the extraction action (true/false).
Selector | A CSS selector string specifying which table rows to target for data extraction.
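As a rough illustration, the three properties could be modelled as a small validated record. This is a sketch, not the node's actual parameter schema: the class name `ExtractTableParams`, the field names, the sample values, and the `False` default for closing on error are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractTableParams:
    """Hypothetical container for the three documented properties."""

    instance_id: str                      # "Instance ID"
    selector: str                         # "Selector" (CSS, targets table rows)
    close_browser_on_error: bool = False  # assumed default, not from the docs

    def __post_init__(self):
        # Reject blank values early, before any remote call is attempted.
        if not self.instance_id.strip():
            raise ValueError("Instance ID must not be empty")
        if not self.selector.strip():
            raise ValueError("Selector must not be empty")


# Example values are made up for illustration.
params = ExtractTableParams(
    instance_id="abc123",
    selector="table#prices tbody tr",
    close_browser_on_error=True,
)
```

A selector scoped to `tbody tr`, as in the example, skips header rows that sit in a `thead`.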

Output

The node outputs JSON data representing the extracted content from the selected table rows. Each item corresponds to a row matched by the selector, containing the structured data parsed from that row's cells.
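The sketch below shows one plausible shape for that output: each row becomes one JSON object keyed by column name. The exact keys the node emits are not documented here, so using the first row as the header is an assumption, and `rows_to_items` is a hypothetical helper.

```python
import json


def rows_to_items(rows, header=None):
    """Turn rows of cell texts into one dict per row.

    If no header is given, the first row is treated as the header
    (an assumption for this sketch). Extra cells beyond the header
    length are dropped, because zip() stops at the shorter sequence.
    """
    if header is None:
        if not rows:
            return []
        header, rows = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in rows]


# Made-up sample data for illustration.
sample_rows = [
    ["Product", "Price"],
    ["Widget", "9.99"],
    ["Gadget", "24.50"],
]

items = rows_to_items(sample_rows)
payload = json.dumps(items, indent=2)  # JSON text, one object per table row
```

Cell values stay as strings here; converting prices or dates to typed values would be a separate step downstream.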

This operation is documented as producing JSON only. Binary output, such as screenshots or downloaded files, is not described for it.

Dependencies

  • Requires connection to a remote Playwright service accessible via an API URL configured in credentials.
  • Needs an active Playwright browser instance identified by the provided Instance ID.
  • The node depends on the remote Playwright API being available and responsive.

Troubleshooting

  • Common issues:

    • Incorrect or expired Instance ID leading to failure in connecting to the browser session.
    • Invalid or overly broad/narrow CSS selectors causing no data or incorrect data to be extracted.
    • Network or authentication issues preventing communication with the remote Playwright API.
  • Error messages and resolutions:

    • "Instance not found" — Verify the Instance ID is correct and the Playwright instance is running.
    • "Selector did not match any elements" — Check and refine the CSS selector to correctly target table rows.
    • "Connection refused or timeout" — Ensure the remote Playwright API endpoint is reachable and credentials are valid.
    • If "Close Browser on error" is enabled, the browser closes automatically when an error occurs. Disable it while debugging so the session stays open for inspection.
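The effect of "Close Browser on error" can be sketched as a wrapper around the extraction call. This is an illustrative pattern under assumptions, not the node's source: `run_extraction`, `failing_extract`, and the callbacks are hypothetical stand-ins for the remote API calls.

```python
def run_extraction(extract, close_browser, close_on_error):
    """Run extract(); on failure optionally close the browser, then re-raise.

    Both callables are placeholders for whatever the remote API exposes.
    """
    try:
        return extract()
    except Exception:
        if close_on_error:
            close_browser()
        raise  # the original error still reaches the caller


# Simulated failure, reusing one of the messages listed above.
closed = []


def failing_extract():
    raise RuntimeError("Selector did not match any elements")


try:
    run_extraction(failing_extract, lambda: closed.append(True), close_on_error=True)
except RuntimeError as exc:
    message = str(exc)
```

With `close_on_error=False` the same failure leaves `close_browser` uncalled, which is why disabling the option helps when you want to inspect the page state after an error.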
