Docutray

Process documents and search knowledge bases with Docutray services

Actions (3)

Knowledge Base Actions
- Search
Document Actions
- Convert
- Identify

Overview

The node integrates with Docutray services to process documents. The Identify action described here determines a document's type from an image. It accepts the image in one of three ways: binary data from a previous node, a Base64-encoded string, or an image URL. The node sends the image, a list of candidate document type codes, and optional metadata to the Docutray API, which returns the identification results.

This node is beneficial in scenarios where automated document classification or recognition is needed, such as:

- Automatically sorting scanned documents by type.
- Extracting document type information from uploaded images.
- Integrating document processing workflows that require document type identification before further processing.

For example, a user might upload an image of a bank statement or credit card statement, provide expected document type codes, and receive structured identification results to route the document accordingly.

Properties

Name	Meaning
Input Method	Method to provide the image for identification. Options: `Binary Data` (image from previous node), `Base64` (Base64 encoded image string), `URL` (image URL).
Binary Property	Name of the binary property containing the image data when using `Binary Data` input method. Default is `"data"`.
Base64 Image	Base64 encoded image data string, used when `Input Method` is set to `Base64`.
Image URL	URL of the image to process, used when `Input Method` is set to `URL`.
Document Type Options	List of possible document type codes to guide identification. Each entry requires a document type code string (e.g., `cartola_cc`, `cartola_tc`). Multiple values can be provided.
Image Content Type	Content type of the image when using `Base64` or `URL` input methods. Options include BMP, GIF, JPEG, PDF, PNG, WebP. Default is `image/jpeg`.
Document Metadata	Additional metadata about the document in JSON format. This is optional and can include any extra information relevant to the document being processed.
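
As an illustration of how the `Base64 Image` and `Image Content Type` values might be prepared outside n8n, the following Python sketch encodes raw bytes and picks a content type from a file extension. It assumes the six listed options correspond to their standard MIME types. The helper names `content_type_for` and `to_base64` are hypothetical and not part of the node.

```python
import base64
from pathlib import PurePath

# Assumption: the node's six options correspond to these standard MIME types.
CONTENT_TYPES = {
    ".bmp": "image/bmp",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".webp": "image/webp",
}
DEFAULT_CONTENT_TYPE = "image/jpeg"  # default stated in the properties table


def content_type_for(filename: str) -> str:
    """Pick a content type from the file extension (hypothetical helper)."""
    suffix = PurePath(filename).suffix.lower()
    if not suffix:
        return DEFAULT_CONTENT_TYPE  # no extension: fall back to the default
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"unsupported file extension: {suffix}")
    return CONTENT_TYPES[suffix]


def to_base64(data: bytes) -> str:
    """Encode raw image bytes as a Base64 string (no data: URI prefix added)."""
    return base64.b64encode(data).decode("ascii")
```

Raising on an unknown extension, rather than silently using the default, surfaces an unsupported file before any request is made.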

Output

The node outputs a JSON object containing the response from the Docutray API's identify endpoint. This typically includes:

- Identification results indicating the recognized document type(s).
- Confidence scores or similarity metrics if applicable.
- Any additional metadata returned by the service related to the document identification.

The output is always JSON describing the identification result. The node emits no binary data, even when the input was binary.
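
The routing scenario from the overview could be handled downstream along the following lines. This is a minimal sketch only: the response shape used here (a `document_type` object with `code` and `confidence` keys) is an assumption, not the documented Docutray schema, and the route names and the 0.8 threshold are purely illustrative. Adjust the key names to match the JSON the node actually returns.

```python
from typing import Any, Mapping

# Illustrative pairing of the example codes with destination names (hypothetical).
ROUTES = {
    "cartola_cc": "bank-statements",
    "cartola_tc": "credit-card-statements",
}


def route_for(result: Mapping[str, Any],
              min_confidence: float = 0.8,
              fallback: str = "manual-review") -> str:
    """Choose a destination from an identification result.

    Assumes a hypothetical shape: {"document_type": {"code": ..., "confidence": ...}}.
    """
    doc_type = result.get("document_type") or {}
    code = doc_type.get("code")
    confidence = doc_type.get("confidence")
    # Route only when the code is known and the score clears the threshold.
    if (code in ROUTES
            and isinstance(confidence, (int, float))
            and confidence >= min_confidence):
        return ROUTES[code]
    # Unknown code, missing score, or low confidence: send to a human.
    return fallback
```

Falling back to a manual-review route keeps unrecognized or low-confidence documents from being filed automatically.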

Dependencies

- Requires an active Docutray API key credential configured in n8n for authentication.
- Connects to the Docutray API at https://app.docutray.com/api/identify.
- Sends the request as multipart form-data (for binary inputs) or as a JSON body (for Base64 or URL inputs).
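
To make the choice between the two request encodings concrete, the sketch below plans a request for each input method without sending anything. The endpoint URL comes from this page. Every field name (`image`, `image_base64`, `image_url`, `image_content_type`, `document_type_code_options`, `document_metadata`) is a hypothetical placeholder, since the actual API field names are not given here.

```python
import json
from typing import Any, Dict, Optional, Sequence

API_URL = "https://app.docutray.com/api/identify"  # endpoint named on this page


def plan_request(input_method: str,
                 image: Any,
                 codes: Sequence[str],
                 content_type: str = "image/jpeg",
                 metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Describe the request for one input method. Field names are hypothetical."""
    shared: Dict[str, Any] = {"document_type_code_options": list(codes)}
    if metadata is not None:
        shared["document_metadata"] = metadata

    if input_method == "Binary Data":
        # Multipart form fields are strings, so structured values are JSON-encoded.
        fields = {key: json.dumps(value) for key, value in shared.items()}
        return {"url": API_URL, "encoding": "multipart/form-data",
                "files": {"image": image}, "fields": fields}

    # Base64 and URL inputs travel in a JSON body together with the content type.
    if input_method == "Base64":
        body = {"image_base64": image}
    elif input_method == "URL":
        body = {"image_url": image}
    else:
        raise ValueError(f"unknown input method: {input_method!r}")
    body["image_content_type"] = content_type
    body.update(shared)
    return {"url": API_URL, "encoding": "application/json", "body": body}
```

For example, `plan_request("URL", "https://example.com/scan.jpg", ["cartola_cc"])` yields a JSON plan, while passing `"Binary Data"` with raw bytes yields a multipart plan.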

Troubleshooting

- Invalid JSON in Document Metadata: If the Document Metadata field contains invalid JSON, the node will throw an error. Ensure the JSON is well-formed.
- Missing or Incorrect Binary Property: When using binary data input, ensure the specified binary property exists and contains valid base64-encoded image data.
- Unsupported Image Content Type: The content type must match one of the supported types (BMP, GIF, JPEG, PDF, PNG, WebP). Using unsupported types may cause errors.
- API Authentication Errors: Verify that the API key credential is correctly set up and has necessary permissions.
- Empty or Invalid Document Type Codes: Provide valid document type codes in the Document Type Options to guide the identification process effectively.
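
Several of the problems above can be caught before the node runs. The following Python sketch checks the metadata JSON, the content type, and the document type codes. It is an illustration rather than the node's own validation logic: the function name `check_inputs` is hypothetical, and the MIME strings assume the listed options use their standard values.

```python
import json
from typing import List, Sequence

# Assumption: the supported options correspond to these standard MIME types.
SUPPORTED_CONTENT_TYPES = {
    "image/bmp", "image/gif", "image/jpeg",
    "application/pdf", "image/png", "image/webp",
}


def check_inputs(metadata_text: str,
                 content_type: str,
                 codes: Sequence[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []

    # Document Metadata is optional, so only parse it when something was entered.
    if metadata_text.strip():
        try:
            json.loads(metadata_text)
        except json.JSONDecodeError as exc:
            problems.append(
                f"Document Metadata is not valid JSON: {exc.msg} "
                f"(line {exc.lineno}, column {exc.colno})"
            )

    if content_type not in SUPPORTED_CONTENT_TYPES:
        problems.append(f"Unsupported Image Content Type: {content_type}")

    # Ignore blank entries so a list of empty strings counts as empty.
    if not [code for code in codes if code and code.strip()]:
        problems.append("No document type codes were provided")

    return problems
```

Collecting every problem in one pass, instead of stopping at the first, lets all input errors be fixed together.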

Links and References

- Docutray API Documentation (general reference for endpoints)
- n8n Documentation: Creating Custom Nodes
- JSON Validator: useful for validating the Document Metadata JSON input
