Overview
This node extracts web page content from one or more URLs using the Tavily Extract API. It is useful when you need to retrieve and analyze web page content programmatically, for example for data scraping, content aggregation, or research. The node supports a basic or advanced extraction depth and can optionally include the images found on each page.
Use Case Examples
- Extract content from a list of URLs to gather article text and metadata.
- Retrieve web page content including images for building a media-rich dataset.
- Use advanced extraction depth to get more detailed data from complex web pages.
Properties
| Name | Meaning |
|---|---|
| URLs | One or more URLs to extract content from. |
| Include Images | Whether to include a list of images extracted from each URL. |
| Extract Depth | How deeply to parse each URL, with 'basic' for standard extraction and 'advanced' for more detailed data at the cost of increased latency. |
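
To make the properties concrete, here is a minimal sketch of how they might be mapped onto a request body. The field names `urls`, `include_images`, and `extract_depth`, the default values, and the helper `build_extract_payload` are assumptions made for illustration; they are not taken from the node's source, so check them against the Tavily Extract documentation.

```python
from typing import Any, Dict, List

# The two depth values named in the Properties table.
VALID_DEPTHS = ("basic", "advanced")


def build_extract_payload(
    urls: List[str],
    include_images: bool = False,   # assumed default
    extract_depth: str = "basic",   # assumed default
) -> Dict[str, Any]:
    """Map the three node properties onto a request body (field names assumed)."""
    if extract_depth not in VALID_DEPTHS:
        raise ValueError(f"extract_depth must be one of {VALID_DEPTHS}")
    return {
        "urls": list(urls),
        "include_images": include_images,
        "extract_depth": extract_depth,
    }


# Hypothetical usage: one URL, images included, advanced depth.
payload = build_extract_payload(
    ["https://example.com/article"],
    include_images=True,
    extract_depth="advanced",
)
```

Rejecting unknown depth values before sending anything keeps a typo from turning into an API round trip.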
Output
JSON
- results: The extracted content results from the URLs.
- failed_results: Any URLs that failed to be processed.
- response_time: The time taken by the API to respond.
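
A downstream step will usually want to separate successes from failures. The sketch below shows one way to do that, using only the three output fields listed above. The per-item keys (`url`, `raw_content`, `error`), the helper `summarize_extract_output`, and the sample data are hypothetical and stand in for whatever the node actually returns.

```python
from typing import Any, Dict


def summarize_extract_output(output: Dict[str, Any]) -> Dict[str, Any]:
    """Split the node output into succeeded and failed URLs."""
    # Treat a missing or null field as an empty list.
    results = output.get("results") or []
    failed = output.get("failed_results") or []
    return {
        # Assumes each item carries a "url" key; None is used if it does not.
        "succeeded_urls": [item.get("url") for item in results],
        "failed_urls": [item.get("url") for item in failed],
        "response_time": output.get("response_time"),
    }


# Made-up sample for illustration, not real API output.
sample_output = {
    "results": [{"url": "https://example.com/a", "raw_content": "Example text"}],
    "failed_results": [{"url": "https://example.com/b", "error": "Example error"}],
    "response_time": 1.2,
}
summary = summarize_extract_output(sample_output)
```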
Dependencies
- Requires a Tavily API key credential for authentication.
Troubleshooting
- A missing or invalid API key causes authentication errors; ensure the Tavily API key is set correctly in the credentials.
- Providing an empty or non-array URLs input will cause an error; ensure at least one valid URL is provided.
- Network or API errors may occur; check API availability and network connectivity.
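
The empty or non-array URLs error can be caught before the node runs. The following is a minimal sketch of such a pre-check, assuming that only `http` and `https` URLs are acceptable; the helpers `validate_urls` and `_is_http_url` are hypothetical and are not part of the node.

```python
from typing import List
from urllib.parse import urlparse


def _is_http_url(value: object) -> bool:
    """Return True for strings that parse as http(s) URLs with a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_urls(urls: object) -> List[str]:
    """Reject empty, non-list, or malformed URL input with a clear message."""
    if not isinstance(urls, list) or not urls:
        raise ValueError("URLs must be a non-empty list")
    invalid = [u for u in urls if not _is_http_url(u)]
    if invalid:
        raise ValueError(f"Invalid URLs: {invalid}")
    # Strip surrounding whitespace so the cleaned list can be passed on.
    return [u.strip() for u in urls]
```

Running this in a preceding step turns a vague node failure into a message that names the offending input.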
Links
- Tavily Extract API Documentation - Official API documentation for the Tavily Extract endpoint used by this node.