N8N Nodes Crawler

Crawl the npm registry and extract information about n8n community nodes

Overview

The 'N8N Nodes Crawler' node crawls the npm registry and extracts information about n8n community nodes. It is useful when you want to gather metadata about n8n-related npm packages for analysis, monitoring, or integration. For example, it can fetch the version, author, and score of every npm package whose name starts with 'n8n-nodes-', across multiple pages of search results.

Use Case Examples

  1. A user wants to collect data on all n8n community nodes available on npm to analyze their popularity and update frequency.
  2. A developer needs to monitor new n8n nodes published on npm by crawling the registry regularly.

Properties

  • Search Query - The search query string used to find npm packages, e.g., 'n8n-nodes-' to find all n8n nodes.
  • Total Pages - Number of pages to crawl. Use '~' to crawl all available pages or specify a number to limit the crawl.
  • Page Size - Number of results per page to fetch from the npm registry, with a maximum of 250.
  • Request Delay - Delay in milliseconds between requests to avoid rate limiting.
  • Output Format - Format of the output data: either split into individual items for each node or combined into a single object with metadata.
  • Options - Additional options including request timeout, retry behavior, SSL/TLS settings, and proxy configuration.
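
To illustrate how Search Query, Total Pages, and Page Size could translate into paged requests, here is a minimal sketch. It assumes the public npm search endpoint (`/-/v1/search` with `text`, `size`, and `from` parameters); the node's actual request code is not shown in this document, and `CrawlParams` and `buildPageUrls` are hypothetical names used only for illustration.

```typescript
// Assumption: the public npm search endpoint. The node's real request code
// is not part of this document.
const NPM_SEARCH_URL = "https://registry.npmjs.org/-/v1/search";

// Hypothetical parameter names mirroring the properties listed above.
interface CrawlParams {
  searchQuery: string;
  totalPages: number; // already resolved to a number; '~' handling is not covered here
  pageSize: number;   // capped at 250, the maximum stated above
}

// Build one search URL per page, advancing the `from` offset by the page size.
function buildPageUrls(params: CrawlParams): string[] {
  const size = Math.min(Math.max(1, Math.floor(params.pageSize)), 250);
  const urls: string[] = [];
  for (let page = 0; page < params.totalPages; page++) {
    const text = encodeURIComponent(params.searchQuery);
    urls.push(`${NPM_SEARCH_URL}?text=${text}&size=${size}&from=${page * size}`);
  }
  return urls;
}
```

Calling `buildPageUrls({ searchQuery: "n8n-nodes-", totalPages: 2, pageSize: 250 })` yields two URLs whose `from` offsets are 0 and 250. A crawler would then request each URL in turn, waiting for the configured Request Delay between calls.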

Output

JSON

  • totalCount - Total number of nodes retrieved.
  • searchQuery - The search query used for crawling.
  • nodes - List of crawled packages; each entry contains:
      * name - Name of the npm package.
      * version - Version of the npm package.
      * description - Description of the npm package.
      * keywords - Keywords associated with the npm package.
      * author - Author information of the npm package.
      * publisher - Publisher information of the npm package.
      * maintainers - List of maintainers of the npm package.
      * links - Links related to the npm package (e.g., homepage, repository).
      * publishedDate - Date when the npm package was published.
      * score
          * final - Final score of the npm package from npm registry.
          * detail - Detailed scoring metrics of the npm package.
      * searchScore - Search relevance score of the npm package.
  • crawledAt - Timestamp when the crawl was performed.
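
The combined output described above can be sketched as TypeScript types plus a small mapper. This is an illustration under assumptions, not the node's actual code: the input shape (`package`, `score`, `searchScore`) follows the objects returned by the npm search endpoint, mapping `package.date` to `publishedDate` is a guess, and `NpmSearchObject`, `CrawledNode`, `toCrawledNode`, and `combineOutput` are hypothetical names.

```typescript
// Assumed shape of one entry in the npm search response.
interface NpmSearchObject {
  package: {
    name: string;
    version: string;
    description?: string;
    keywords?: string[];
    author?: unknown;
    publisher?: unknown;
    maintainers?: unknown[];
    links?: Record<string, string>;
    date?: string;
  };
  score?: { final?: number; detail?: Record<string, number> };
  searchScore?: number;
}

// One entry of the `nodes` list, using the field names documented above.
interface CrawledNode {
  name: string;
  version: string;
  description?: string;
  keywords: string[];
  author?: unknown;
  publisher?: unknown;
  maintainers: unknown[];
  links: Record<string, string>;
  publishedDate?: string; // assumption: taken from `package.date`
  score?: { final?: number; detail?: Record<string, number> };
  searchScore?: number;
}

// Flatten one search result into the documented per-node shape.
function toCrawledNode(obj: NpmSearchObject): CrawledNode {
  const pkg = obj.package;
  return {
    name: pkg.name,
    version: pkg.version,
    description: pkg.description,
    keywords: pkg.keywords ?? [],
    author: pkg.author,
    publisher: pkg.publisher,
    maintainers: pkg.maintainers ?? [],
    links: pkg.links ?? {},
    publishedDate: pkg.date,
    score: obj.score,
    searchScore: obj.searchScore,
  };
}

// Wrap the node list in the combined object with its metadata fields.
function combineOutput(searchQuery: string, nodes: CrawledNode[], now: Date = new Date()) {
  return { totalCount: nodes.length, searchQuery, nodes, crawledAt: now.toISOString() };
}
```

In the split output format, each `CrawledNode` would presumably be emitted as its own item instead of being wrapped by `combineOutput`.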

Dependencies

  • axios for HTTP requests to npm registry
  • Node.js https and http modules for agent configuration

Troubleshooting

  • Invalid 'Total Pages' input: the value must be '~' or a positive number.
  • Network request failures caused by timeouts or connectivity issues: enable retries and increase the request timeout in Options.
  • SSL certificate verification failures when connecting through environments that use self-signed certificates: verification can be disabled in the SSL/TLS settings, but only do so in trusted environments.
  • Rate limiting by the npm registry when requests are too frequent: increase 'Request Delay' to space out requests.
  • Failed requests caused by proxy misconfiguration: if a proxy is used, check that its settings are correct.
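
The first two items above can be sketched in code: validating the 'Total Pages' value and retrying a failed request after a delay. This is a minimal sketch, not the node's implementation. `parseTotalPages` and `withRetry` are hypothetical helpers, and requiring a whole number (rather than any positive number) is an assumption.

```typescript
// Resolve the 'Total Pages' input: '~' means no limit, otherwise a positive
// whole number is required (the whole-number check is an assumption).
function parseTotalPages(input: string): number {
  const trimmed = input.trim();
  if (trimmed === "~") return Infinity;
  const pages = Number(trimmed);
  if (!Number.isInteger(pages) || pages <= 0) {
    throw new Error(`Invalid 'Total Pages' value "${input}": use '~' or a positive number`);
  }
  return pages;
}

// Pause for the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying up to `maxRetries` more times with a fixed delay
// between attempts; rethrow the last error if every attempt fails.
async function withRetry<T>(task: () => Promise<T>, maxRetries: number, delayMs: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

A fixed delay keeps the sketch short; a crawler that is being rate limited may benefit from increasing the delay after each failed attempt.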
