Exa Websets

Create, manage, and query structured datasets from web sources using the Exa Websets API

Overview

The Exa Websets node lets users create, manage, and query structured datasets derived from web content through the Exa Websets API. The Search - Create operation runs a targeted search within a specified webset, using a user-defined query together with optional filters and configuration settings.

This operation is useful when you need to extract specific information from large collections of web data organized into websets. For example:

  • Researchers looking for recent academic papers or news articles on a specific topic.
  • Marketing teams searching for company mentions or social media discussions.
  • Analysts filtering financial reports or blog posts related to a market segment.

By configuring search parameters such as content category, language, date ranges, and domain filters, users can tailor the search results to their precise needs.

Properties

Name | Meaning
Webset ID | The unique identifier of the webset in which to perform the search.
Query | The search string or keywords to search for (e.g., "machine learning applications in healthcare").
Search Configuration | A collection of optional settings that refine the search:
  Category | Filter results by content type. Options: Blog Post, Company, Financial Report, Forum Discussion, GitHub, LinkedIn, News Article, PDF, Research Paper, Tweet, Wikipedia. Default is Company.
  Count | Number of search results to return (1 to 1000). Default is 10.
  End Crawl Date | Latest crawl date to include, in ISO 8601 format (e.g., 2024-12-31T23:59:59Z).
  End Published Date | Latest publication date to include, in ISO 8601 format.
  Exclude Domains | Comma-separated list of domains to exclude from search results (e.g., "reddit.com, twitter.com").
  Include Content | Whether to include the full extracted content of found pages. Default is true.
  Include Domains | Comma-separated list of domains to restrict the search to (e.g., "techcrunch.com, wired.com").
  Include Highlights | Whether to include highlighted excerpts from the found content. Default is false.
  Include Summary | Whether to include AI-generated summaries of the found content. Default is false.
  Language | Filter results by content language. Options: Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish. Default is English.
  Search Type | Search algorithm to use. Options: Auto, Keyword, Neural. Default is Auto.
  Start Crawl Date | Earliest crawl date to include, in ISO 8601 format.
  Start Published Date | Earliest publication date to include, in ISO 8601 format.
  Use Autoprompt | Whether to enable the autoprompt feature, which is intended to improve search results. Default is true.
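
As a rough illustration of how these properties combine, the sketch below assembles a search configuration from the documented defaults, enforces the 1 to 1000 range for Count, and splits the comma-separated domain lists. The function names and the camelCase keys (`includeDomains`, `searchType`, and so on) are assumptions made for this example; they are not confirmed field names of the node or of the Exa API.

```python
# Defaults taken from the property descriptions above.
# ASSUMPTION: the camelCase key names are illustrative only.
DEFAULTS = {
    "category": "Company",
    "count": 10,
    "includeContent": True,
    "includeHighlights": False,
    "includeSummary": False,
    "language": "English",
    "searchType": "Auto",
    "useAutoprompt": True,
}


def split_domains(raw: str) -> list:
    """Turn 'a.com, b.com' into ['a.com', 'b.com'], dropping blank entries."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def build_search_config(**overrides) -> dict:
    """Merge user overrides into the defaults and normalize the values."""
    config = {**DEFAULTS, **overrides}
    if not 1 <= config["count"] <= 1000:
        raise ValueError("count must be between 1 and 1000")
    # Domain filters are entered as comma-separated strings in the node UI.
    for key in ("includeDomains", "excludeDomains"):
        if isinstance(config.get(key), str):
            config[key] = split_domains(config[key])
    return config


config = build_search_config(
    category="News Article",
    count=25,
    includeDomains="techcrunch.com, wired.com",
)
print(config["includeDomains"])
```

Only the settings that differ from the defaults need to be passed; everything else keeps the default value listed in the table.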

Output

The node outputs a JSON array containing the search results matching the query and configuration. Each item in the output typically includes metadata about the found content such as title, URL, publication date, and optionally:

  • Full extracted content if Include Content is enabled.
  • Highlighted excerpts if Include Highlights is enabled.
  • AI-generated summaries if Include Summary is enabled.

The exact fields depend on the API response; each item represents one matched document or entry.

No binary data output is indicated for this operation.
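
Because the optional fields appear only when the matching flag is enabled, downstream code should not assume they are present. The following sketch shows one defensive way to turn result items into one-line descriptions. The field names (`title`, `url`, `summary`, `highlights`) and the `describe_item` helper are assumptions for illustration; check an actual node output before relying on them.

```python
def describe_item(item: dict) -> str:
    """Build a one-line description, using optional fields only if present."""
    title = item.get("title") or "(untitled)"
    url = item.get("url", "")
    line = f"{title} <{url}>"
    # Prefer the summary; fall back to the first highlight if there is one.
    if item.get("summary"):
        line += f": {item['summary']}"
    elif item.get("highlights"):
        line += f": {item['highlights'][0]}"
    return line


# Hypothetical sample items shaped like the output described above.
results = [
    {"title": "Paper A", "url": "https://example.com/a", "summary": "Short summary."},
    {"title": None, "url": "https://example.com/b", "highlights": ["First excerpt.", "Second."]},
    {"title": "Post C", "url": "https://example.com/c"},
]

for item in results:
    print(describe_item(item))
```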

Dependencies

  • Requires an active connection to the Exa Websets API via an API key credential configured in n8n.
  • Network access to the https://api.exa.ai endpoint.
  • Properly configured credentials with permissions to perform search operations on the specified websets.
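
To check a credential outside n8n, it can help to see what a request to the endpoint might look like. The sketch below only constructs the request and does not send it. The base URL comes from the list above; the request path, the `x-api-key` header name, and the body shape are assumptions for this example and should be verified against the Exa API documentation.

```python
import json
import urllib.request

BASE_URL = "https://api.exa.ai"  # endpoint named in the dependency list


def build_search_request(api_key: str, webset_id: str, query: str) -> urllib.request.Request:
    """Construct (but do not send) a search request for a webset."""
    if not api_key:
        raise ValueError("API key is missing")
    # ASSUMPTION: path, header name, and body shape are illustrative only.
    url = f"{BASE_URL}/websets/v0/websets/{webset_id}/searches"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


request = build_search_request("example-key", "ws_123", "machine learning applications in healthcare")
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; an HTTP 401 or 403 response at that point would typically indicate a credential or permission problem.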

Troubleshooting

  • Common Issues:

    • Invalid or missing Webset ID will cause the search to fail.
    • Incorrect date formats for crawl or published date filters may result in errors or no results.
    • Overly restrictive domain inclusion/exclusion lists might yield empty results.
    • Using unsupported languages or categories could lead to unexpected behavior.
  • Error Messages:

    • "Unknown resource": Indicates the resource parameter is incorrect or not supported.
    • API errors related to authentication usually mean the API key credential is invalid or expired.
    • Validation errors on input parameters suggest required fields are missing or incorrectly formatted.
  • Resolutions:

    • Verify all required properties are set correctly.
    • Ensure dates follow ISO 8601 format.
    • Check API key validity and permissions.
    • Adjust filters to broaden search scope if no results are returned.
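
Several of the issues above can be caught before the node runs. The sketch below is a minimal pre-flight check, assuming the inputs are available as plain strings and a dictionary; the `check_search_input` helper and the key names (`startCrawlDate`, `includeDomains`, and so on) are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# ASSUMPTION: illustrative key names for the four date filters.
DATE_KEYS = ("startCrawlDate", "endCrawlDate", "startPublishedDate", "endPublishedDate")


def is_iso8601(value: str) -> bool:
    """Accept timestamps such as 2024-12-31T23:59:59Z."""
    try:
        # Older Python versions reject a trailing 'Z', so normalize it first.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def check_search_input(webset_id: str, query: str, config: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not webset_id.strip():
        problems.append("Webset ID is required")
    if not query.strip():
        problems.append("Query is required")
    for key in DATE_KEYS:
        value = config.get(key)
        if value and not is_iso8601(value):
            problems.append(f"{key} is not a valid ISO 8601 date: {value!r}")
    # A domain listed in both filters can never match anything.
    overlap = set(config.get("includeDomains", [])) & set(config.get("excludeDomains", []))
    if overlap:
        problems.append("Domains both included and excluded: " + ", ".join(sorted(overlap)))
    return problems


for problem in check_search_input("", "ai chips", {"endCrawlDate": "31/12/2024"}):
    print(problem)
```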
