FireCrawl

FireCrawl API

Overview

The FireCrawl node's "Submit A Crawl Job With A Webhook" operation allows you to initiate a web crawling job for a specified URL and receive the results via a webhook. This is useful for automating website data extraction, content monitoring, or structured data collection workflows. For example, you might use this node to crawl a blog, extract articles in Markdown or HTML, and have the results sent to your own endpoint for further processing.

Properties

Below are the input properties supported by this operation:

Display Name | Type | Description
Url | String (required) | The URL to crawl.
Limit | Number | Maximum number of results to return. Minimum value: 1. Default: 50.
Webhook | String | URL to send webhook events to.
Exclude Paths | Collection | List of paths to exclude from the crawl. You can add multiple items (strings).
Scrape Options | Collection | Scraping options, including output formats and extraction configuration:
  - Formats (Multi-select): Output format(s) for the scraped data. Options: Markdown, Html, Extract.
  - Extract: Structured extraction settings:
    - Schema: The schema for structured data extraction.
    - Systemprompt: The system prompt used for extraction.
    - Prompt: Extraction prompt without schema.
Use Custom Body | Boolean | Whether to use a custom body for the request. If enabled, all other fields are hidden.
Custom Body | JSON | Custom body to send. Allows full control over the request payload.
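
To make the relationship between these properties concrete, the following is a minimal sketch of how they might be assembled into a request payload. The function name build_crawl_body and the JSON key names (excludePaths, scrapeOptions, and so on) are assumptions inferred from the display names above, not confirmed FireCrawl field names; check them against the API reference before relying on them.

```python
from typing import Any, Optional


def build_crawl_body(
    url: str,
    limit: int = 50,
    webhook: Optional[str] = None,
    exclude_paths: Optional[list[str]] = None,
    formats: Optional[list[str]] = None,
    extract: Optional[dict[str, Any]] = None,
    custom_body: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a crawl payload from the node's properties.

    Key names are assumptions derived from the property display names.
    """
    # "Use Custom Body": the custom payload replaces every other field.
    if custom_body is not None:
        return custom_body

    if not url:
        raise ValueError("Url is required")
    if limit < 1:
        raise ValueError("Limit must be at least 1")

    body: dict[str, Any] = {"url": url, "limit": limit}
    if webhook:
        body["webhook"] = webhook
    if exclude_paths:
        body["excludePaths"] = list(exclude_paths)

    # Scrape Options is only sent when at least one sub-field is set.
    scrape_options: dict[str, Any] = {}
    if formats:
        scrape_options["formats"] = list(formats)
    if extract:
        scrape_options["extract"] = dict(extract)
    if scrape_options:
        body["scrapeOptions"] = scrape_options
    return body


# Hypothetical example values, for illustration only.
body = build_crawl_body(
    "https://example.com/blog",
    limit=10,
    webhook="https://example.com/hooks/firecrawl",
    exclude_paths=["/tag", "/author"],
    formats=["markdown"],
)
```

Optional fields are omitted rather than sent as empty values, which mirrors how the node hides unset properties, and the custom body short-circuits everything else, matching the "all other fields are hidden" behavior described above.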

Output

  • The node outputs a JSON object containing the FireCrawl API's response to the crawl job submission.
  • The response typically describes the submitted job, such as its job ID and status, and possibly details about the webhook registration. The crawl results themselves are delivered to the webhook.
  • Binary data is not typical for this operation. If any is returned, it would represent crawled content or extracted files.

Dependencies

  • External Service: Requires access to the FireCrawl API.
  • API Key/Credentials: You must configure the "FireCrawl API" credentials in n8n, including the baseUrl.
  • Webhook Endpoint: You need a publicly accessible webhook URL to receive crawl results.
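
For local testing of the webhook requirement, a receiver only needs to accept POST requests with a JSON body. The sketch below uses Python's standard library; the handler name, the received_events list, and the {"type": "test"} payload are hypothetical, and the actual shape of FireCrawl's webhook events is not documented here. The last lines post a test event to the receiver itself so the example runs without any external service.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received_events: list = []  # events collected by the receiver


class CrawlWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            event = json.loads(raw or b"{}")
        except ValueError:  # invalid JSON or invalid UTF-8
            self.send_response(400)
            self.end_headers()
            return
        received_events.append(event)
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args) -> None:
        pass  # keep the example quiet


def start_receiver(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the receiver in a background thread (port 0 picks a free port)."""
    server = HTTPServer((host, port), CrawlWebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_json(url: str, payload: dict) -> int:
    """POST a JSON payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Bypass any system proxy so the local round trip stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status


# Self-test: send one hypothetical event to the local receiver.
server = start_receiver()
status = post_json(f"http://127.0.0.1:{server.server_port}/", {"type": "test"})
server.shutdown()
server.server_close()
```

A receiver bound to 127.0.0.1 is not reachable by FireCrawl; for real crawl jobs the endpoint has to be exposed at a publicly accessible URL, as noted above.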

Troubleshooting

  • Missing or Invalid Credentials:
    Error: Requests fail with authentication errors when the FireCrawl API credentials are missing or incorrect.
    Resolution: Ensure the "FireCrawl API" credentials, including the baseUrl, are configured correctly in n8n.

  • Invalid URL or Parameters:
    Error: The API may reject requests with invalid URLs or malformed parameters.
    Resolution: Double-check the "Url" field and any custom body JSON for correctness.

  • Webhook Not Receiving Data:
    Error: No data received at the specified webhook.
    Resolution: Make sure the webhook URL is correct, publicly accessible, and able to accept POST requests.

  • Custom Body Errors:
    Error: Malformed JSON in the "Custom Body" field can cause request failures.
    Resolution: Validate your JSON before submitting.
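
The URL and Custom Body checks above can be run before the workflow executes. This is a minimal sketch using only the Python standard library; the helper names validate_url and parse_custom_body are hypothetical, and the requirement that the custom body be a JSON object (rather than an array or scalar) is an assumption.

```python
import json
from urllib.parse import urlparse


def validate_url(url: str) -> None:
    """Raise ValueError unless url is an absolute http(s) URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")


def parse_custom_body(text: str) -> dict:
    """Parse the Custom Body text, reporting where malformed JSON breaks."""
    try:
        body = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Custom Body is not valid JSON "
            f"(line {exc.lineno}, column {exc.colno}): {exc.msg}"
        ) from exc
    # Assumption: the request payload must be a JSON object.
    if not isinstance(body, dict):
        raise ValueError("Custom Body must be a JSON object")
    return body


validate_url("https://example.com/blog")
custom = parse_custom_body('{"url": "https://example.com/blog", "limit": 5}')
```

The same validate_url check applies to the "Webhook" field, since a relative or scheme-less webhook URL cannot receive events.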
