N8N Tools - Document Processor
Overview
The N8N Tools - Document Processor node enables document processing through OCR (Optical Character Recognition), text extraction, and AI-powered analysis using the N8N Tools platform. It supports multiple input sources: binary files, URLs, and Base64-encoded documents. The node sends each document to an external API for processing and can return the extracted content as JSON, Text, Markdown, or HTML.
This node is useful when you need to automate the extraction of text or structured data from scanned documents, PDFs, or images. For example:
- Extracting invoice data or receipts for accounting automation.
- Converting scanned contracts into searchable text.
- Performing AI-based classification or metadata extraction on documents.
- Processing documents from URLs or directly uploaded files.
Properties
| Name | Meaning |
|---|---|
| Input Source | Selects the source of the document to process: • Binary File: process a file from binary data in the workflow. • URL: process a document at a publicly accessible URL. • Base64: process a Base64-encoded document string. |
| Binary Property | (Required if Input Source is Binary File) The name of the binary property containing the file to process. |
| Document URL | (Required if Input Source is URL) The publicly accessible URL of the document to process. |
| Document (Base64) | (Required if Input Source is Base64) The Base64-encoded content of the document. |
| Processing Options | Collection of options to customize processing: • Output Format: JSON, Text, Markdown, or HTML for extracted content. • Language: OCR language code (e.g., auto, en, pt, es, fr, de). • Include Images: whether to extract images. • Include Tables: whether to preserve table structures. • AI Analysis: enable AI-powered content analysis. • Extract Metadata: extract document metadata like author and creation date. |
| Output | How to return processed results: • JSON Response: return processed data as JSON. • Binary File: return the processed document as a binary file. • Both: return both JSON data and a binary file. |
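The conditional requirements in the table (each input source needs exactly one companion field) can be sketched as a small check. This is only an illustration: the internal names `inputSource`, `binaryPropertyName`, `documentUrl`, and `documentBase64`, and the values `binary`, `url`, and `base64`, are assumptions, since only the display names are documented here.

```python
from typing import Optional

# Hypothetical internal names; only the display names appear in the table.
REQUIRED_FIELD = {
    "binary": "binaryPropertyName",  # Binary Property
    "url": "documentUrl",            # Document URL
    "base64": "documentBase64",      # Document (Base64)
}


def missing_required_field(params: dict) -> Optional[str]:
    """Return the name of the required field that is missing, or None."""
    source = params.get("inputSource")
    if source not in REQUIRED_FIELD:
        # Mirrors the documented "Unknown input source" error text.
        raise ValueError(f"Unknown input source: {source}")
    field = REQUIRED_FIELD[source]
    value = params.get(field)
    # A field counts as present only if it is a non-blank string.
    return None if isinstance(value, str) and value.strip() else field
```

For instance, `missing_required_field({"inputSource": "base64"})` reports `documentBase64` as missing, while a URL source with a non-empty `documentUrl` returns `None`.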
Output
The node outputs an array of items with the following structure depending on the selected output mode:
- JSON Response: The `json` field contains the processed document data returned by the API, which may include extracted text, tables, images (if requested), metadata, AI analysis results, job ID, processing time, and success status.
- Binary File: The `binary` field contains the processed document file (e.g., PDF) as binary data, along with relevant metadata such as filename and MIME type. The `json` field includes operation metadata and any warnings if binary data is missing.
- Both: Combines the above two outputs, providing both the JSON data and the processed document as a binary file.
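As a rough illustration of the three output modes, the sketch below assembles one output item. The mode values (`json`, `binary`, `both`), the `data` key inside `binary`, and the `warning` key are hypothetical names chosen for the example, not the node's confirmed field names.

```python
from typing import Optional


def build_item(mode: str, api_data: dict, file_bytes: Optional[bytes] = None) -> dict:
    """Assemble one output item for a given (hypothetical) output mode."""
    item: dict = {"json": {}}
    if mode in ("json", "both"):
        # JSON Response: pass the processed data through under "json".
        item["json"] = dict(api_data)
    if mode in ("binary", "both"):
        if file_bytes is not None:
            # Binary File: attach the processed document under "binary".
            item["binary"] = {"data": file_bytes}
        else:
            # The text says a warning is reported when binary data is missing.
            item["json"]["warning"] = "No binary data returned"
    return item
```

In this sketch, `binary` mode without file bytes yields an item that has no `binary` key and carries only the warning in `json`.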
If asynchronous processing is used, the node polls the job status up to 3 times at 10-second intervals (30 seconds in total, matching the timeout reported in the error messages) before falling back to synchronous processing or throwing a timeout error.
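The polling behavior described above can be sketched as follows. This is a minimal illustration rather than the node's actual code: `get_status`, `run_sync`, and the `"completed"` status value are hypothetical, and waiting before each check (rather than after) is an assumption.

```python
import time
from typing import Callable, Optional

MAX_POLLS = 3         # from the text: up to 3 status checks
POLL_INTERVAL_S = 10  # from the text: 10 seconds between checks


def wait_for_job(
    get_status: Callable[[], dict],
    run_sync: Optional[Callable[[], dict]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll an async job, then fall back to a synchronous call or time out."""
    for _ in range(MAX_POLLS):
        sleep(POLL_INTERVAL_S)  # assumed: wait first, then check
        status = get_status()
        if status.get("status") == "completed":  # hypothetical status value
            return status
    if run_sync is not None:
        # Fallback path: try synchronous processing instead.
        return run_sync()
    raise TimeoutError(
        f"Document processing timeout after {MAX_POLLS * POLL_INTERVAL_S} seconds."
    )
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave it at the default `time.sleep`.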
Dependencies
- Requires an API key credential for the N8N Tools platform to authenticate requests.
- Needs network access to the N8N Tools API endpoint.
- The node uses HTTP requests to communicate with the external document processing service.
- No additional environment variables are required beyond the API key credential configuration.
Troubleshooting
- Invalid subscription or API key:
  - Error message: "N8N Tools API: Invalid subscription or API key. Please check your credentials."
  - Resolution: Verify that the API key credential is correct and active.
- No binary data found under property:
  - Error message: "No binary data found under property \"<propertyName>\""
  - Resolution: Ensure the specified binary property exists in the input data and contains valid binary content.
- Unknown operation or input source:
  - Error message: "Unknown operation: <operation>" or "Unknown input source: <inputSource>"
  - Resolution: Confirm that the operation and input source values are valid and supported.
- Document processing timeout:
  - Error message: "Document processing timeout after 30 seconds. JobId: <jobId>. This may indicate the Redis service is not connected or the document processing queue is not running."
  - Resolution: Check the availability and health of the external processing service and its dependencies.
- Async and sync processing failed:
  - Error message: "Both async and sync processing failed. Async timeout after 30 seconds (JobId: <jobId>), sync error: <error message>"
  - Resolution: Investigate network issues, API service status, or malformed input data.
- Missing or invalid document URL/Base64:
  - If the input source is URL or Base64, ensure the provided URL is publicly accessible and the Base64 string is correctly formatted.
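For the URL and Base64 cases, a quick local pre-check can catch malformed input before it reaches the API. The helper names below are hypothetical, and the checks are purely syntactic: they cannot confirm that a URL is publicly reachable, and strict Base64 validation rejects strings containing whitespace or a `data:` prefix.

```python
import base64
import binascii
from urllib.parse import urlparse


def looks_like_http_url(value: str) -> bool:
    """True if the value has an http(s) scheme and a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_valid_base64(value: str) -> bool:
    """True if the value is non-empty, strictly valid Base64."""
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return bool(value)
```

A string such as `aGVsbG8=` passes the Base64 check, while `not base64!` and the empty string do not.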
Links and References
- N8N Documentation – General n8n usage and node development.
- N8N Tools Platform – Official site for the document processing API (hypothetical link).
- OCR Language Codes – Reference for supported OCR languages.
This summary is based solely on static analysis of the provided source code and property definitions.