Overview
The node integrates with the Docutray API to identify document types from images. It accepts the image in one of three ways: binary data from a previous node, a Base64-encoded string, or an image URL. The node sends the image, a list of candidate document type codes, and optional metadata to the Docutray API, which returns the identification results.
This node is beneficial in scenarios where automated document classification or recognition is needed, such as:
- Automatically sorting scanned documents by type.
- Extracting document type information from uploaded images.
- Integrating document processing workflows that require document type identification before further processing.
For example, a user might upload an image of a bank statement or credit card statement, provide expected document type codes, and receive structured identification results to route the document accordingly.
Properties
| Name | Meaning |
|---|---|
| Input Method | Method to provide the image for identification. Options: Binary Data (image from previous node), Base64 (Base64 encoded image string), URL (image URL). |
| Binary Property | Name of the binary property containing the image data when using Binary Data input method. Default is "data". |
| Base64 Image | Base64 encoded image data string, used when Input Method is set to Base64. |
| Image URL | URL of the image to process, used when Input Method is set to URL. |
| Document Type Options | List of possible document type codes to guide identification. Each entry requires a document type code string (e.g., cartola_cc, cartola_tc). Multiple values can be provided. |
| Image Content Type | Content type of the image when using the Base64 or URL input methods. Options: BMP, GIF, JPEG, PDF, PNG, WebP. Default is JPEG (image/jpeg). |
| Document Metadata | Additional metadata about the document in JSON format. This is optional and can include any extra information relevant to the document being processed. |
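The two documented defaults in the table above (Binary Property is "data", Image Content Type is image/jpeg) can be pictured with a small sketch. The interface and field names below are hypothetical and chosen for illustration; they are not the node's internal parameter names.

```typescript
// Hypothetical model of the node's properties; names are illustrative only.
type InputMethod = "binary" | "base64" | "url";

interface IdentifyParams {
  inputMethod: InputMethod;
  binaryProperty?: string;       // used with the Binary Data input method
  base64Image?: string;          // used with the Base64 input method
  imageUrl?: string;             // used with the URL input method
  imageContentType?: string;     // used with Base64 or URL
  documentTypeOptions: string[]; // e.g. ["cartola_cc", "cartola_tc"]
  documentMetadata?: string;     // optional JSON text
}

// Fill in the defaults stated in the Properties table.
function applyDefaults(params: IdentifyParams): IdentifyParams {
  return {
    ...params,
    binaryProperty: params.binaryProperty ?? "data",
    imageContentType: params.imageContentType ?? "image/jpeg",
  };
}
```

Values that are set explicitly are kept as given; only missing ones fall back to the defaults.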
Output
The node outputs a JSON object containing the response from the Docutray API's identify endpoint. This typically includes:
- Identification results indicating the recognized document type(s).
- Confidence scores or similarity metrics if applicable.
- Any additional metadata returned by the service related to the document identification.
The node never outputs binary data. Even when the input is binary, the output is only the JSON describing the identification result.
Dependencies
- Requires an active Docutray API key credential configured in n8n for authentication.
- Connects to the Docutray API at https://app.docutray.com/api/identify.
- Supports sending multipart form-data (for binary inputs) or JSON body (for Base64 or URL inputs).
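The choice of request body described in the list above can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the function and type names are hypothetical, and the body's field names are deliberately left out because this document does not specify them.

```typescript
// Hypothetical helper names; only the endpoint URL and the
// binary -> multipart / Base64 or URL -> JSON rule come from this document.
type InputMethod = "binary" | "base64" | "url";
type BodyEncoding = "multipart/form-data" | "application/json";

const IDENTIFY_URL = "https://app.docutray.com/api/identify";

// Binary inputs are sent as multipart form-data; Base64 and URL inputs as JSON.
function bodyEncodingFor(method: InputMethod): BodyEncoding {
  return method === "binary" ? "multipart/form-data" : "application/json";
}

// Describe the request without sending it.
function planRequest(method: InputMethod): { url: string; encoding: BodyEncoding } {
  return { url: IDENTIFY_URL, encoding: bodyEncodingFor(method) };
}
```

For instance, `planRequest("url")` describes a JSON request to the identify endpoint, while `planRequest("binary")` describes a multipart one.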
Troubleshooting
- Invalid JSON in Document Metadata: If the Document Metadata field contains invalid JSON, the node will throw an error. Ensure the JSON is well-formed.
- Missing or Incorrect Binary Property: When using binary data input, ensure the specified binary property exists and contains valid base64-encoded image data.
- Unsupported Image Content Type: The content type must match one of the supported types (BMP, GIF, JPEG, PDF, PNG, WebP). Using unsupported types may cause errors.
- API Authentication Errors: Verify that the API key credential is correctly set up and has necessary permissions.
- Empty or Invalid Document Type Codes: Provide valid document type codes in the Document Type Options to guide the identification process effectively.
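Several of the issues above can be caught before the node runs. The following is a rough pre-flight check under stated assumptions: the function name and option names are hypothetical, and the MIME strings are the standard ones for the listed formats (only image/jpeg is confirmed by this document), so the values the node actually accepts may differ.

```typescript
// Assumed MIME strings for BMP, GIF, JPEG, PDF, PNG, WebP.
const SUPPORTED_CONTENT_TYPES = [
  "image/bmp", "image/gif", "image/jpeg",
  "application/pdf", "image/png", "image/webp",
];

interface PreflightInput {
  documentMetadata?: string;  // raw JSON text from the Document Metadata field
  imageContentType?: string;  // only relevant for Base64 or URL input
  documentTypeCodes: string[];
}

// Returns a list of human-readable problems; an empty list means no issue was found.
function preflightCheck(input: PreflightInput): string[] {
  const problems: string[] = [];

  // Document Metadata is optional, but must parse as JSON when present.
  if (input.documentMetadata !== undefined && input.documentMetadata.trim() !== "") {
    try {
      JSON.parse(input.documentMetadata);
    } catch {
      problems.push("Document Metadata is not valid JSON");
    }
  }

  // The content type must be one of the supported types.
  if (
    input.imageContentType !== undefined &&
    !SUPPORTED_CONTENT_TYPES.includes(input.imageContentType)
  ) {
    problems.push("Unsupported Image Content Type: " + input.imageContentType);
  }

  // At least one non-empty document type code is needed.
  if (input.documentTypeCodes.filter((code) => code.trim() !== "").length === 0) {
    problems.push("No document type codes provided");
  }

  return problems;
}
```

This only checks the shape of the inputs. It cannot tell whether a document type code exists in a given Docutray account or whether the API key is valid.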
Links and References
- Docutray API Documentation (general reference for endpoints)
- n8n Documentation - Creating Custom Nodes
- JSON Validator – useful for validating the Document Metadata JSON input