olyptik

n8n community node for Olyptik web crawling and content extraction API

Package Information

Downloads: 1 weekly / 6 monthly
Latest Version: 0.1.4
Author: Olyptik

Documentation

n8n-nodes-olyptik


This is an n8n community node that lets you use Olyptik in your n8n workflows.

Olyptik is a powerful web crawling and content extraction API that helps you scrape websites, extract structured data, and convert web content to markdown format.

Installation

Follow the installation guide in the n8n community nodes documentation.

  1. Go to Settings > Community Nodes.
  2. Select Install.
  3. Enter n8n-nodes-olyptik in Enter npm package name.
  4. Agree to the risks of using community nodes: select I understand the risks of installing unverified code from a public source.
  5. Select Install.

After installing the node, you can use it like any other node. n8n displays the node in search results in the Nodes panel.

Credentials

This node requires Olyptik API credentials. You can get your API key from your Olyptik Dashboard.

The node supports the following authentication method:

  • API Key: Your Olyptik API key
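For orientation, API-key authentication usually means attaching the key to every request as a header. The header scheme below (`Authorization: Bearer …`) is an assumption for illustration, not confirmed by this README; check https://docs.olyptik.io for the exact format.

```javascript
// Sketch: build request headers for Olyptik API calls.
// NOTE: the "Authorization: Bearer" header name is an assumption,
// not documented here -- consult https://docs.olyptik.io.
function buildAuthHeaders(apiKey) {
  if (!apiKey) {
    throw new Error('Missing Olyptik API key');
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
}
```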

Supported Operations

Crawl Resource

  • Create: Start a new web crawl
  • Get: Retrieve information about a specific crawl
  • Query: Search and filter your crawls
  • Abort: Stop a running crawl

Crawl Results Resource

  • Get: Retrieve the results from a completed crawl
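As a rough mental model, the four crawl operations map onto typical REST calls. The methods and paths below are hypothetical placeholders, not the documented Olyptik endpoints:

```javascript
// Sketch only: hypothetical method/path mapping for the crawl operations.
// Real endpoints are documented at https://docs.olyptik.io.
function crawlRequest(operation, crawlId) {
  switch (operation) {
    case 'create':
      return { method: 'POST', path: '/crawls' };
    case 'get':
      return { method: 'GET', path: `/crawls/${crawlId}` };
    case 'query':
      return { method: 'GET', path: '/crawls' };
    case 'abort':
      return { method: 'POST', path: `/crawls/${crawlId}/abort` };
    default:
      throw new Error(`Unknown crawl operation: ${operation}`);
  }
}
```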

Trigger Node

The package also includes an Olyptik Trigger node that can listen for webhooks from Olyptik:

  • Crawl Status Change: Triggers when a crawl status changes (e.g., from running to completed)
  • Crawl Result Created: Triggers when new results are found during crawling
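A workflow downstream of the trigger typically branches on which event fired. A minimal sketch of that dispatch, assuming the webhook body carries `event`, `crawlId`, and `status` fields (a hypothetical payload shape, not the documented one):

```javascript
// Sketch: route a webhook payload by event type.
// The payload shape ({ event, crawlId, status }) is assumed for illustration.
function routeWebhook(payload) {
  switch (payload.event) {
    case 'crawl.status_change':
      return `Crawl ${payload.crawlId} is now ${payload.status}`;
    case 'crawl.result_created':
      return `New result for crawl ${payload.crawlId}`;
    default:
      return 'Ignored unknown event';
  }
}
```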

Example Workflows

Basic Web Crawling

  1. Use the Olyptik node to start a crawl
  2. Wait for completion or use the Olyptik Trigger to get notified
  3. Retrieve the crawl results
  4. Process the extracted content
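The steps above can be sketched outside of n8n against a generic client object. `client` here is a stand-in with hypothetical `startCrawl`/`getCrawl`/`getResults` methods, not the actual Olyptik SDK:

```javascript
// Sketch: start a crawl, poll until it finishes, then fetch results.
// `client` is any object exposing the three hypothetical methods used below.
async function crawlAndCollect(client, startUrl, maxResults, pollMs = 1000) {
  const { id } = await client.startCrawl({ startUrl, maxResults });
  let crawl = await client.getCrawl(id);
  while (crawl.status === 'running') {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
    crawl = await client.getCrawl(id);
  }
  if (crawl.status !== 'completed') {
    throw new Error(`Crawl ${id} ended with status ${crawl.status}`);
  }
  return client.getResults(id);
}
```

In a real workflow the polling loop is what the Olyptik Trigger replaces: instead of asking repeatedly, the webhook tells you when the crawl is done.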

Automated Content Monitoring

  1. Set up an Olyptik Trigger for crawl status changes
  2. When a crawl completes, automatically retrieve the results
  3. Send notifications or process the content as needed

Configuration

Starting a Crawl

Required parameters:

  • Start URL: The website URL to begin crawling
  • Max Results: Maximum number of pages to crawl

Optional parameters:

  • Max Depth: How deep to crawl (default: 10)
  • Engine Type: Auto, Cheerio (fast), or Playwright (for JavaScript-heavy sites)
  • Use Sitemap: Whether to use the website's sitemap.xml
  • Entire Website: Whether to crawl the entire website
  • Include Links: Include links in the extracted markdown
  • Use Static IPs: Use static IP addresses for crawling
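Pulled together, a crawl configuration with these defaults might be assembled as below. The field names are guesses for illustration; only the Max Depth default of 10 comes from this README, and the other fallback values ('auto', false) are assumptions:

```javascript
// Sketch: merge required and optional crawl settings into one payload.
// Field names are illustrative; only maxDepth's default of 10 is from the README.
function buildCrawlConfig({ startUrl, maxResults, ...options }) {
  if (!startUrl || !maxResults) {
    throw new Error('startUrl and maxResults are required');
  }
  return {
    startUrl,
    maxResults,
    maxDepth: options.maxDepth ?? 10,
    engineType: options.engineType ?? 'auto',
    useSitemap: options.useSitemap ?? false,
    entireWebsite: options.entireWebsite ?? false,
    includeLinks: options.includeLinks ?? false,
    useStaticIps: options.useStaticIps ?? false,
  };
}
```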

Retrieving Results

  • Crawl ID: The ID of the crawl to get results for
  • Page: Page number for pagination
  • Limit: Number of results per page (1-100)
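The Page/Limit pair behaves like standard paged retrieval. A small sketch of validating those values before a request; the 1-100 bound comes from the README, while the default limit of 20 and the reject-rather-than-clamp behavior are assumptions:

```javascript
// Sketch: validate pagination params for fetching crawl results.
// Limit must be 1-100 per the README; out-of-range values are rejected here.
function resultsQuery(crawlId, page = 1, limit = 20) {
  if (!crawlId) throw new Error('crawlId is required');
  if (!Number.isInteger(page) || page < 1) {
    throw new Error('page must be a positive integer');
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new Error('limit must be between 1 and 100');
  }
  return { crawlId, page, limit };
}
```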

API Documentation

For detailed API documentation, visit: https://docs.olyptik.io

License

MIT
