Overview
The Universal AI Scraper node in the Outscraper integration allows users to scrape data from arbitrary web pages by specifying URLs and attributes to extract. This node is useful for gathering structured information from websites where no dedicated API or pre-built scraper exists. It supports asynchronous scraping with webhook callbacks, enabling efficient handling of long-running tasks.
Typical use cases include:
- Extracting product details, prices, or reviews from e-commerce pages.
- Collecting contact information or metadata from company websites.
- Gathering custom data points from any public webpage by naming the attributes to extract.
For example, a user can input a list of URLs pointing to product pages and specify attributes like "price", "title", or "description" to retrieve those values in a structured format.
Properties
| Name | Meaning |
|---|---|
| Query | Links to web pages to scrape (e.g., https://www.apple.com/iphone/). Required. |
| Attributes | Comma-separated list of attributes to parse from the web page (e.g., price,title,description). |
| Limit | Maximum number of results to return. Minimum value is 1. Default is 50. |
| Async Request | Boolean flag. When true, the request is submitted as an asynchronous task instead of waiting for the results. |
| Webhook | URL to which Outscraper will send a POST request once the async task finishes (callback URL). |
| Additional Fields | Collection of optional fields: |
| - Fields | Specific fields to return, comma-separated. |
| - UI | Boolean flag to execute the request as a UI task (affects how the scraping is performed). |
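The constraints in the table (Query is required, Limit is at least 1 and defaults to 50, Attributes is comma-separated) can be sketched as a small validation helper. This is a minimal illustration, not the node's implementation: the function name `build_scraper_params` and the dictionary keys are hypothetical and simply mirror the property names above, not a confirmed Outscraper request schema.

```python
from typing import Optional


def build_scraper_params(
    urls: list[str],
    attributes: Optional[list[str]] = None,
    limit: int = 50,
    async_request: bool = False,
    webhook: Optional[str] = None,
) -> dict:
    """Validate node-style inputs and return a flat parameter dict.

    The keys are hypothetical; they mirror the property table only.
    """
    cleaned_urls = [u.strip() for u in urls if u and u.strip()]
    if not cleaned_urls:
        raise ValueError("Query is required: provide at least one URL")
    if limit < 1:
        raise ValueError("Limit must be at least 1")

    params: dict = {
        "query": cleaned_urls,
        "limit": limit,
        "async": async_request,
    }
    # Attributes are sent as one comma-separated string, e.g. "price,title".
    cleaned_attrs = [a.strip() for a in (attributes or []) if a.strip()]
    if cleaned_attrs:
        params["attributes"] = ",".join(cleaned_attrs)
    if webhook:
        params["webhook"] = webhook
    return params


params = build_scraper_params(
    ["https://www.apple.com/iphone/"],
    attributes=["price", "title", "description"],
)
```

Validating before the request is sent surfaces a missing URL or an out-of-range limit as a local error rather than as a failed API call.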
Output
The node outputs JSON data containing the scraped results from the specified web pages. The structure typically includes an array of objects representing each scraped item, with keys corresponding to the requested attributes or fields.
If the async option is enabled, the output may initially contain task status information, and the final scraped data will be sent to the provided webhook URL upon completion.
No binary data output is indicated for this operation.
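Because each result object is keyed by the requested attributes, a downstream step can project the items onto a fixed set of columns. The sketch below assumes the output is a plain list of dictionaries, as described above. The sample items are invented for illustration and `rows_from_items` is a hypothetical helper, not part of the node.

```python
from typing import Any


def rows_from_items(
    items: list[dict[str, Any]], attributes: list[str]
) -> list[dict[str, Any]]:
    """Keep only the requested attributes, using None for missing keys."""
    return [{name: item.get(name) for name in attributes} for item in items]


# Hypothetical sample items, shaped like the array of objects described above.
sample_items = [
    {"title": "Example Phone", "price": "$799", "description": "A sample listing"},
    {"title": "Example Case"},  # price and description were not found
]
rows = rows_from_items(sample_items, ["title", "price"])
```

Filling missing keys with `None` keeps every row the same shape, which is convenient when the rows are later written to a spreadsheet or database.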
Dependencies
- Requires an active Outscraper API key credential configured in n8n.
- The node sends HTTP requests to the Outscraper API endpoint defined by the user's credentials.
- For asynchronous scraping, a publicly accessible webhook URL must be provided to receive callback POST requests.
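For the asynchronous case, the callback endpoint only needs to accept a POST request and read its body. The following is a minimal sketch using Python's standard library. It assumes the callback body is JSON, which the source does not confirm, and the names `parse_callback`, `CallbackHandler`, and `make_server` are illustrative. In an n8n workflow, a Webhook trigger node would normally fill this role.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Optional

# Payloads collected from callbacks, kept in memory for this sketch only.
received_payloads: list[Any] = []


def parse_callback(body: bytes) -> tuple[int, Optional[Any]]:
    """Return (HTTP status, parsed JSON) for a callback request body."""
    try:
        return 200, json.loads(body or b"{}")
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, None


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = parse_callback(self.rfile.read(length))
        if status == 200:
            received_payloads.append(payload)
        self.send_response(status)
        self.end_headers()


def make_server(host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    """Create the server; call serve_forever() on it to start listening."""
    return HTTPServer((host, port), CallbackHandler)
```

Calling `make_server().serve_forever()` starts the listener. The host still has to be reachable from the public internet for the callback to arrive.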
Troubleshooting
Common issues:
- Invalid or missing API key: Ensure the API key credential is correctly set up.
- Incorrect query URLs: Verify that the URLs are valid and accessible.
- Attribute parsing errors: Confirm that the specified attributes exist on the target pages.
- Webhook failures: When using async mode, ensure the webhook URL is reachable and accepts POST requests.
Error messages:
- Authentication errors usually indicate invalid API credentials.
- Rate limit exceeded errors suggest too many requests; consider adding delays between requests or reducing the request volume.
- Timeout or network errors may occur if the target website is slow or unreachable.
Resolving these typically involves checking credentials, validating inputs, and ensuring network connectivity.
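For rate-limit and timeout errors, one common way to add delays is to retry with exponential backoff. The sketch below is generic rather than specific to Outscraper or n8n. `TransientError` is a hypothetical exception standing in for whatever error the calling code raises on a rate-limit or network failure, and `call_with_backoff` is an illustrative name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for rate-limit, timeout, or network failures."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), doubling the wait after each transient failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the delay schedule can be checked without waiting. Authentication errors should not be retried this way, since a retry cannot fix invalid credentials.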