Overview
This node allows you to extract elements from a web page using an XPath expression. It leverages a browser automation environment to locate elements matching the XPath and retrieve their attributes, text content, and/or HTML markup. This is useful for web scraping, data extraction, or automating workflows that require reading structured information from web pages.
Typical use cases include:
- Extracting product details (like names, prices) from e-commerce sites.
- Collecting article headlines or summaries from news websites.
- Gathering metadata or links from any structured HTML content.
Properties
| Name | Meaning |
|---|---|
| XPath | The XPath expression used to find elements on the page. |
| Get Attributes | Whether to retrieve all attributes of each found element as key-value pairs. |
| Get Text Content | Whether to retrieve the text content inside each found element. |
| Get HTML | Whether to retrieve the inner and outer HTML markup of each found element. |
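As a rough illustration of the kind of expression the XPath property expects, the sketch below runs two XPath queries against a small, made-up product fragment. It uses Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath and is not the engine this node uses, so treat it as a way to reason about expressions rather than a reproduction of the node's behavior. The sample markup and class names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a product listing page.
SAMPLE = """
<div>
  <div class="product"><h2>Lamp</h2><span class="price">19.99</span></div>
  <div class="product"><h2>Desk</h2><span class="price">149.00</span></div>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# ElementTree paths are relative to the root element, hence the leading ".//".
# In a browser, the equivalent expression would typically be
# "//span[@class='price']".
price_nodes = root.findall(".//span[@class='price']")
prices = [node.text for node in price_nodes]

# A predicate on the parent followed by a child step selects the product names.
names = [node.text for node in root.findall(".//div[@class='product']/h2")]
```

Each matched element here corresponds to one output item of the node; the three "Get ..." toggles only control which details are read from each match.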
Output
The node outputs an array of JSON objects, one per matched element. Each object contains:
- `xpath`: The XPath expression used for the search.
- `found`: A boolean always set to `true`, indicating the element was found.
- Depending on the selected properties, it may also include:
  - `textContent`: The textual content inside the element.
  - `innerHTML`: The inner HTML markup of the element.
  - `outerHTML`: The full HTML markup including the element itself.
  - `attributes`: An object mapping attribute names to their values for the element.
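The output shape described above could be sketched as follows. This is a minimal illustration, not the node's actual implementation: the `build_item` helper and the dictionary used to stand in for a matched element are hypothetical, and only the output field names come from this page.

```python
from typing import Any, Dict


def build_item(
    xpath: str,
    element: Dict[str, Any],
    get_attributes: bool = False,
    get_text: bool = False,
    get_html: bool = False,
) -> Dict[str, Any]:
    """Assemble one output object for one matched element (hypothetical helper)."""
    # These two fields are always present for a matched element.
    item: Dict[str, Any] = {"xpath": xpath, "found": True}
    if get_text:
        item["textContent"] = element["textContent"]
    if get_html:
        # "Get HTML" yields both the inner and the outer markup.
        item["innerHTML"] = element["innerHTML"]
        item["outerHTML"] = element["outerHTML"]
    if get_attributes:
        item["attributes"] = dict(element["attributes"])
    return item


# Made-up data standing in for what the browser would report for one element.
sample_element = {
    "textContent": "19.99",
    "innerHTML": "19.99",
    "outerHTML": '<span class="price">19.99</span>',
    "attributes": {"class": "price"},
}

item = build_item("//span[@class='price']", sample_element, get_text=True)
```

With only Get Text Content enabled, the resulting object carries `xpath`, `found`, and `textContent`; the HTML and attribute fields are omitted rather than set to empty values in this sketch.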
No binary data output is produced by this node.
Dependencies
- Requires a running browser instance managed by the node's browser manager component.
- The browser must be launched before executing this node; otherwise, it will throw an error.
- No external API keys or credentials are needed, but the environment must support browser automation.
Troubleshooting
- Browser not running error: If the node throws "Browser is not running. Please launch the browser first.", ensure that a browser session has been started in the workflow prior to this node.
- XPath required error: The node requires a non-empty XPath string. Make sure the XPath property is correctly set.
- Failed to get elements error: This indicates issues during element retrieval, possibly due to invalid XPath syntax or page load problems. Verify the XPath expression and confirm the page is fully loaded.
- Empty output: If no elements match, the output is an empty array. Check that the XPath expression is correct and that the page has finished loading the content you are targeting.
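The failure modes listed above can be pictured as a sequence of checks. The sketch below is an assumption about how such checks might be ordered, not the node's source code: `NodeError`, `extract_elements`, and the `query` callback are hypothetical names, and only the "Browser is not running" message is quoted from this page; the other two messages are placeholder wording.

```python
from typing import Any, Callable, Dict, List


class NodeError(Exception):
    """Hypothetical error type standing in for the node's own errors."""


def extract_elements(
    browser_running: bool,
    xpath: str,
    query: Callable[[str], List[Any]],
) -> List[Dict[str, Any]]:
    # 1. A browser session must already exist.
    if not browser_running:
        raise NodeError("Browser is not running. Please launch the browser first.")
    # 2. The XPath must be a non-empty string (message wording is assumed).
    if not xpath or not xpath.strip():
        raise NodeError("XPath is required")
    # 3. Retrieval problems, such as invalid XPath syntax, are wrapped
    #    (message wording is assumed).
    try:
        elements = query(xpath)
    except Exception as exc:
        raise NodeError(f"Failed to get elements: {exc}") from exc
    # 4. No matches is not an error: the result is simply an empty list.
    return [{"xpath": xpath, "found": True} for _ in elements]
```

Distinguishing an empty result from an error in this way means downstream workflow steps can treat "nothing matched" as normal data instead of a failure.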
Links and References
- XPath Tutorial
- Playwright Documentation (for understanding browser automation concepts)
- n8n Documentation (general workflow and node usage guidance)