
Exa Websets

Create, manage, and query structured datasets from web sources using the Exa Websets API.

Actions: 28

Overview

The Exa Websets node allows users to create, manage, and query structured datasets called "websets" that aggregate web content from various sources. Specifically, the Preview operation under the Webset resource enables users to perform a search query on these websets to preview relevant content before creating or updating a webset.

This node is beneficial in scenarios where you want to gather curated web content filtered by categories, dates, languages, domains, or specific entities. For example, marketing teams can preview recent news articles about a competitor, researchers can find relevant research papers within a date range, or analysts can monitor social media posts filtered by language and domain.

Practical examples:

  • Preview the latest 20 blog posts mentioning "artificial intelligence startups" published in English.
  • Search for financial reports related to a specific company, excluding certain domains.
  • Retrieve tweets in multiple languages about a trending topic using neural search.

Properties

Name | Meaning
Search Configuration | A collection of parameters to customize the search query:
- Category | Filter content by type. Options: Blog Post, Company, Financial Report, Forum Discussion, GitHub, LinkedIn, News Article, PDF, Research Paper, Tweet, Wikipedia.
- Count | Number of results to return (1 to 1000).
- End Crawl Date | Latest crawl date to include, in ISO 8601 format.
- End Published Date | Latest publication date to include, in ISO 8601 format.
- Entity | Specific entity name to search for (e.g., "OpenAI").
- Exclude Domains | Comma-separated list of domains to exclude from search results (e.g., "reddit.com, twitter.com").
- Include Domains | Comma-separated list of domains to restrict search results to (e.g., "techcrunch.com, wired.com").
- Language | Filter content by language. Options: Any Language, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish.
- Query | The search query string used to find relevant content (e.g., "artificial intelligence startups").
- Search Type | Search algorithm to use. Options: Auto, Keyword, Neural.
- Start Crawl Date | Earliest crawl date to include, in ISO 8601 format.
- Start Published Date | Earliest publication date to include, in ISO 8601 format.
- Use Autoprompt | Boolean flag that enables or disables autoprompt, a feature intended to improve search relevance.
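
To make the shape of a search configuration concrete, here is a minimal Python sketch that assembles one from the properties listed above. The function names and the snake_case key names are illustrative assumptions; they are not taken from the node's source or from the Exa API. The sketch only shows how the comma-separated domain fields might be split into lists and how the 1 to 1000 limit on Count could be enforced before a request is made.

```python
def split_domains(raw: str) -> list[str]:
    """Turn "reddit.com, twitter.com" into ["reddit.com", "twitter.com"]."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def build_search_config(query: str, count: int = 10, **options) -> dict:
    """Assemble a search configuration, dropping options that were left empty.

    Key names are hypothetical and chosen for illustration only.
    """
    if not 1 <= count <= 1000:
        raise ValueError("count must be between 1 and 1000")
    config = {"query": query, "count": count}
    # Domain filters arrive as comma-separated strings and are stored as lists.
    for key in ("include_domains", "exclude_domains"):
        domains = split_domains(options.pop(key, ""))
        if domains:
            config[key] = domains
    # Keep every remaining option that has a non-empty value.
    config.update({k: v for k, v in options.items() if v not in ("", None)})
    return config


# Mirrors the first practical example: 20 English blog posts on a topic.
example = build_search_config(
    "artificial intelligence startups",
    count=20,
    category="Blog Post",
    language="English",
    exclude_domains="reddit.com, twitter.com",
)
```

Options that are left empty are omitted from the result, so the configuration contains only the filters that were actually set.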

Output

The node outputs JSON data containing the search results matching the specified criteria. Each item in the output represents a piece of content found by the search, including metadata such as title, URL, publication date, category, and possibly enriched information depending on the API response.
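
As a rough illustration of working with such output, the Python sketch below condenses a list of result items into one line each. The key names (`title`, `url`, `publishedDate`) are assumptions based on the metadata described above, and the sample items are made up for the example; the actual field names depend on the API response.

```python
def summarize_items(items: list[dict]) -> list[str]:
    """Build one summary line per result item, tolerating missing fields."""
    lines = []
    for item in items:
        title = item.get("title") or "(untitled)"
        url = item.get("url", "")
        published = item.get("publishedDate") or "unknown date"
        lines.append(f"{published}  {title}  {url}".rstrip())
    return lines


# Made-up sample items; real output comes from the node.
sample = [
    {"title": "Example post", "url": "https://example.com/a", "publishedDate": "2024-05-01"},
    {"url": "https://example.com/b"},
]
summary = summarize_items(sample)
```

Using `.get` with fallbacks keeps the summary robust when an item lacks a title or publication date.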

The provided code does not show any binary output. If binary data were returned, it would typically be downloadable content, such as PDFs or images linked from the search results.

Dependencies

  • Requires an API key credential for authenticating with the Exa Websets API.
  • The node communicates with the Exa Websets REST API at https://api.exa.ai.
  • No additional external dependencies are indicated beyond this API connection.
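
For readers who want to see what an authenticated call against this base URL could look like outside the node, the following Python sketch builds (but does not send) a request using only the standard library. The `x-api-key` header name and the request path are assumptions, not details confirmed by this page; the path shown is a deliberate placeholder, so check the Exa API reference for the real endpoint and authentication scheme.

```python
import json
import urllib.request

BASE_URL = "https://api.exa.ai"  # base URL named in the Dependencies list


def build_request(api_key: str, path: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request; nothing is sent until urlopen() is called."""
    if not api_key:
        raise ValueError("an API key is required")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The path below is a placeholder, not a documented endpoint.
request = build_request(
    "YOUR_API_KEY",
    "/hypothetical/preview/path",
    {"query": "artificial intelligence startups"},
)
```

Passing the prepared object to `urllib.request.urlopen(request)` would perform the call; a missing or invalid key would then surface as an authentication error from the API.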

Troubleshooting

  • Common Issues:

    • Invalid or missing API credentials will cause authentication failures.
    • Incorrectly formatted ISO 8601 dates may result in errors or empty results.
    • Listing the same domain in both Include Domains and Exclude Domains might yield unexpected results.
    • Requests for more than 1000 results will be rejected, because Count is limited to the range 1 to 1000.
  • Error Messages:

    • "Unknown resource": Occurs if the resource parameter is not set to a valid option; ensure "websets" is selected.
    • API errors related to invalid parameters will usually include descriptive messages; verify all input fields, especially dates and domain lists.
    • Network or connectivity issues will manifest as request failures; check internet connection and API endpoint accessibility.
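
Several of the common issues above can be caught before a request is made. The Python sketch below is one possible pre-flight check, assuming a configuration dictionary with hypothetical snake_case keys; it is not part of the node. Note that `datetime.fromisoformat` accepts only a subset of ISO 8601 on Python versions before 3.11, so the check is conservative.

```python
from datetime import datetime

DATE_KEYS = (
    "start_crawl_date",
    "end_crawl_date",
    "start_published_date",
    "end_published_date",
)


def is_iso8601(value: str) -> bool:
    """Return True if the value parses as an ISO 8601 date or datetime."""
    try:
        # Older Python versions do not accept a trailing "Z", so normalize it.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def find_problems(config: dict) -> list[str]:
    """Collect human-readable problems instead of failing on the first one."""
    problems = []
    if not 1 <= config.get("count", 1) <= 1000:
        problems.append("count must be between 1 and 1000")
    for key in DATE_KEYS:
        value = config.get(key)
        if value and not is_iso8601(value):
            problems.append(f"{key} is not a valid ISO 8601 date: {value!r}")
    overlap = set(config.get("include_domains", [])) & set(config.get("exclude_domains", []))
    if overlap:
        problems.append("domains both included and excluded: " + ", ".join(sorted(overlap)))
    return problems


bad = find_problems({
    "count": 5000,
    "start_published_date": "01/02/2024",
    "include_domains": ["wired.com"],
    "exclude_domains": ["wired.com"],
})
```

Returning a list of problems rather than raising on the first one lets all input mistakes be fixed in a single pass.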
