Firecrawl

Firecrawl is an LLM-friendly web crawling system.

Actions: 9

Overview

The Firecrawl node integrates with Firecrawl, a system for web scraping and crawling. For the V1 resource and the 取消获取整站任务 Crawl/{ID} (Cancel Whole Site Crawl Task) operation, the node cancels an in-progress whole-site crawl identified by its task ID.

This is useful when a large-scale crawl of a website has to be stopped before it completes, for example because requirements changed, the crawl is producing errors, or resources are constrained.

Practical examples:

- A marketing analyst starts a full website crawl to gather data, finds that it is taking too long, and cancels it.
- A developer automates web scraping jobs and includes logic to cancel tasks that exceed a time limit or are no longer needed.
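
The time-limit case in the second example can be sketched as a small decision helper. This is a minimal sketch and not part of the node: `should_cancel`, its parameters, and the 30-minute limit are hypothetical names and values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def should_cancel(started_at: datetime, now: datetime, max_runtime: timedelta) -> bool:
    """Return True when a crawl has run longer than the allowed time.

    Hypothetical helper: the node only cancels a task by ID; deciding
    when to cancel is left to the surrounding workflow.
    """
    return now - started_at > max_runtime


# Example: a crawl started 45 minutes ago, checked against a 30-minute limit.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
check = start + timedelta(minutes=45)
exceeded = should_cancel(start, check, timedelta(minutes=30))
```

When the helper returns True, the workflow would pass the task ID to the cancel operation described here.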

Properties

Name	Meaning
任务ID (Task ID)	The unique identifier of the crawl task to cancel. Required; it specifies which crawl job to stop.
返回格式 (Return Format)	Not applicable to this operation; available only for other operations such as searching webpages. Options: Extract, HTML, Links, Markdown, 原始HTML (Raw HTML), 整个页面截图 (Full Page Screenshot), 网页截图 (Webpage Screenshot).

Note: For the 取消获取整站任务 Crawl/{ID} operation, only the 任务ID (Task ID) property applies, and it is required.

Output

The node outputs JSON describing the result of the cancellation request. This typically includes a confirmation of the cancellation status, an error message if the cancellation failed, or metadata about the task.

No binary data output is indicated for this operation.
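
A downstream step might reduce that JSON to a simple outcome. The sketch below assumes the payload may carry a `status` or an `error` key; both field names are assumptions made for illustration, since the actual schema is defined by the Firecrawl API and is not specified here. `summarize_cancel_result` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_cancel_result(payload: Dict[str, Any]) -> str:
    """Classify the node's JSON output as 'cancelled', 'failed', or 'unknown'.

    Assumption: "status" and "error" are illustrative key names, not a
    confirmed part of the Firecrawl response schema.
    """
    if payload.get("error"):
        return "failed"
    if str(payload.get("status", "")).lower() == "cancelled":
        return "cancelled"
    return "unknown"
```

Returning "unknown" for anything unrecognized keeps the helper from reporting success on a payload it does not understand.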

Dependencies

- An API key credential is required to authenticate with the Firecrawl service.
- The base URL of the Firecrawl API must be configured in the node credentials.
- The node sends HTTP requests to the Firecrawl API endpoint for the V1 resource.
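
To show how the API key, base URL, and task ID fit together, here is a minimal sketch that builds (but does not send) an equivalent request with the Python standard library. The `DELETE` method, the `/v1/crawl/{id}` path, and the `Bearer` authorization scheme are assumptions inferred from the operation name, not details confirmed by this page; `build_cancel_request` and the example host are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_cancel_request(base_url: str, api_key: str, task_id: str) -> Request:
    """Build a cancel request for one crawl task without sending it.

    Assumptions: DELETE on {base_url}/v1/crawl/{task_id} with a Bearer token.
    """
    task_id = task_id.strip()
    if not task_id:
        raise ValueError("task_id is required")
    # Percent-encode the ID so it cannot alter the URL path.
    url = f"{base_url.rstrip('/')}/v1/crawl/{quote(task_id, safe='')}"
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {api_key}"})


# Placeholder values; substitute the base URL and key from your credentials.
req = build_cancel_request("https://firecrawl.example.com/", "YOUR_API_KEY", "task-123")
```

Passing `req` to `urllib.request.urlopen` would send the request; that step is left out so the sketch runs without network access.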

Troubleshooting

Common issues:
- Providing an invalid or non-existent task ID will likely result in an error response from the API.
- Network connectivity problems or incorrect API base URL configuration can cause request failures.
- Missing or invalid API authentication credentials will prevent successful communication with the Firecrawl service.

Error messages:
- Errors indicating "task not found" suggest the provided task ID does not exist or has already been completed/cancelled.
- Authentication errors indicate issues with the API key or credential setup.
- HTTP status errors are ignored by default (ignoreHttpStatusErrors: true), so the node may return error details in the JSON output rather than throwing exceptions.
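
Because failures can arrive as ordinary output, a workflow may want to translate an HTTP status code into one of the hints above. The sketch below is one way to do that; the mapping follows general HTTP conventions (401/403 for authentication, 404 for a missing resource) and is an assumption rather than documented Firecrawl behavior. `troubleshooting_hint` is a hypothetical helper.

```python
from typing import Optional


def troubleshooting_hint(status_code: int) -> Optional[str]:
    """Map an HTTP status code to a troubleshooting hint, or None on success.

    Assumption: the codes follow general HTTP conventions; the exact codes
    returned by Firecrawl are not confirmed here.
    """
    if 200 <= status_code < 300:
        return None
    if status_code in (401, 403):
        return "Check the API key and credential setup."
    if status_code == 404:
        return "Task not found: the ID may be wrong, or the task already completed or was cancelled."
    return f"Unexpected HTTP status {status_code}: check the API base URL and the service status."
```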

Links and References

Firecrawl official documentation (not provided here; consult your Firecrawl API docs)
n8n documentation on creating and using custom nodes and credentials
