Exa Websets

Create, manage, and query structured datasets from web sources using the Exa Websets API

Actions: 28

Overview

The Exa Websets node allows users to create, manage, and query structured datasets called "websets" from various web sources using the Exa Websets API. Specifically, the Create operation under the Webset resource enables users to define a new webset by specifying detailed search configurations that filter and collect relevant web content.

This node is useful when you want to aggregate and structure web content around specific topics, entities, or categories for further analysis, monitoring, or enrichment. For example, marketing teams can create websets to track news articles and social media posts about their brand, researchers can gather academic papers on a topic, and analysts can monitor financial reports and company data.

Practical examples:

  • Creating a webset to collect recent blog posts and tweets about "artificial intelligence startups".
  • Building a dataset of news articles and research papers published within a certain date range related to a specific company.
  • Filtering content by language and domain to focus on trusted sources only.
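
As a rough illustration, the three examples above could map onto property values like the following. This is a sketch only: the dictionary keys are the display names from the Properties section, not confirmed API field names, and `configs_per_category` is a hypothetical helper. It assumes Category takes a single value, so covering two categories means building one configuration per category.

```python
def configs_per_category(categories, shared):
    """Build one search configuration per category.

    Assumption: Category is single-valued, so each category gets its own config.
    """
    return [{**shared, "Category": category} for category in categories]


# Blog posts and tweets about "artificial intelligence startups".
ai_startups = configs_per_category(
    ["Blog Post", "Tweet"],
    {"Query": "artificial intelligence startups"},
)

# News articles and research papers about one company within a date range.
company_coverage = configs_per_category(
    ["News Article", "Research Paper"],
    {
        "Entity": "OpenAI",
        "Start Published Date": "2024-01-01T00:00:00Z",
        "End Published Date": "2024-12-31T23:59:59Z",
    },
)

# Restrict results to one language and a short list of trusted domains.
trusted_sources = {
    "Query": "artificial intelligence startups",
    "Language": "English",
    "Include Domains": "techcrunch.com, wired.com",
}
```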

Properties

| Name | Meaning |
| --- | --- |
| Search Configuration | A collection of parameters defining how to search and filter content for the webset: |
| - Category | Content category to filter by. Options include Blog Post, Company, Financial Report, Forum Discussion, GitHub, LinkedIn, News Article, PDF, Research Paper, Tweet, Wikipedia. Default is Company. |
| - Count | Number of results to return (minimum 1, maximum 1000). Default is 10. |
| - End Crawl Date | ISO 8601 formatted end date for when content was crawled (e.g., "2024-12-31T23:59:59Z"). |
| - End Published Date | ISO 8601 formatted end date for when content was published. |
| - Entity | Specific entity name to search for (e.g., "OpenAI"). |
| - Exclude Domains | Comma-separated list of domains to exclude from search results (e.g., "reddit.com, twitter.com"). |
| - Include Domains | Comma-separated list of domains to include in search results (e.g., "techcrunch.com, wired.com"). |
| - Language | Filter content by language. Options include Any Language, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish. Default is Any Language. |
| - Query | Search query string to find relevant content (e.g., "artificial intelligence startups"). |
| - Search Type | Type of search to perform. Options are Auto, Keyword, Neural. Default is Auto. |
| - Start Crawl Date | ISO 8601 formatted start date for when content was crawled. |
| - Start Published Date | ISO 8601 formatted start date for when content was published. |
| - Use Autoprompt | Boolean flag to enable Exa's autoprompt feature for improved search results. Default is true. |
| External ID | Optional external identifier string for the created webset. |
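
To make the defaults and the comma-separated domain fields concrete, here is a minimal sketch of how a search configuration might be normalized before use. It is not the node's actual implementation: `DEFAULTS`, `split_domains`, and `normalize` are hypothetical names, and the keys are the display names listed above rather than confirmed API field names. Lowercasing the domains is an extra choice made here for easier comparison.

```python
# Defaults as documented in the Properties section.
DEFAULTS = {
    "Category": "Company",
    "Count": 10,
    "Language": "Any Language",
    "Search Type": "Auto",
    "Use Autoprompt": True,
}


def split_domains(value):
    """Turn "reddit.com, twitter.com" into ["reddit.com", "twitter.com"]."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


def normalize(config):
    """Fill in documented defaults and parse the domain lists.

    Values supplied by the caller always win over the defaults.
    """
    merged = {**DEFAULTS, **config}
    for key in ("Include Domains", "Exclude Domains"):
        if key in merged:
            merged[key] = split_domains(merged[key])
    return merged


example = normalize(
    {"Query": "artificial intelligence startups",
     "Exclude Domains": "reddit.com, twitter.com"}
)
```

Here `example` keeps the supplied query, picks up the documented defaults for the unset fields, and holds the excluded domains as a list.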

Output

The node outputs JSON data representing the result of the webset creation request. This typically includes metadata about the newly created webset, such as its unique identifier and status, and may include summary information about the included content or search configuration.

Nothing in the node's properties indicates binary output, such as files or documents fetched or generated during the operation.

Dependencies

  • Requires an active connection to the Exa Websets API endpoint at https://api.exa.ai.
  • Requires an API authentication token credential configured in n8n to authorize requests to the Exa Websets service.
  • The node depends on internal service modules (WebsetsService etc.) which handle the actual API calls.

Troubleshooting

  • Common issues:

    • Invalid or missing API credentials will cause authentication failures.
    • Incorrectly formatted dates (non-ISO 8601) in date fields may lead to API errors.
    • Specifying conflicting domain filters (include and exclude overlapping domains) might yield unexpected results.
    • Requests for more than 1000 results exceed the Count limit and will be rejected.
  • Error messages:

    • "Unknown resource": Occurs if the resource parameter is set incorrectly; ensure "websets" is selected.
    • API errors returned from the Exa service will be passed through; check error details for invalid parameters or quota issues.
    • Network or connectivity errors: verify internet access and that the API endpoint is reachable.
  • Resolutions:

    • Verify and re-enter API credentials.
    • Ensure all date inputs follow ISO 8601 format.
    • Adjust count values within allowed range.
    • Review domain filters for logical consistency.
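
Several of the issues above can be caught before the node runs. The following is a sketch of such a pre-flight check, assuming the configuration is held in a dictionary keyed by the property display names. `is_iso8601` and `preflight` are hypothetical helpers and are not part of the node or the Exa API.

```python
from datetime import datetime

DATE_FIELDS = (
    "Start Crawl Date",
    "End Crawl Date",
    "Start Published Date",
    "End Published Date",
)


def is_iso8601(value):
    """Return True if the string parses as an ISO 8601 date or datetime.

    datetime.fromisoformat accepts only a subset of ISO 8601 on older Python
    versions, so a trailing "Z" is rewritten as an explicit UTC offset first.
    """
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def domain_set(value):
    """Parse a comma-separated domain string into a set of lowercase domains."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


def preflight(config):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    count = config.get("Count", 10)
    if not 1 <= count <= 1000:
        problems.append("Count must be between 1 and 1000")
    for field in DATE_FIELDS:
        value = config.get(field)
        if value and not is_iso8601(value):
            problems.append(f"{field} is not ISO 8601: {value!r}")
    overlap = sorted(
        domain_set(config.get("Include Domains", ""))
        & domain_set(config.get("Exclude Domains", ""))
    )
    if overlap:
        problems.append("Domains both included and excluded: " + ", ".join(overlap))
    return problems
```

Returning a list of problems rather than raising on the first one lets all input mistakes be reported in a single pass.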
