Overview
This node integrates with the Mistral OCR API to extract text and structured data from a range of document formats: PDFs, images, Word, PowerPoint, RTF, EPUB, LaTeX, and Jupyter Notebooks. It supports OCR with annotations: users select a pre-configured document template or define custom fields for data extraction, and can optionally include bounding box annotations for visual elements such as charts and tables. The node handles file uploads, retries requests that hit rate limits, and returns detailed OCR results with metadata. It is useful for automating data extraction from invoices, contracts, letters, receipts, ID documents, research papers, and other document types.
Use Case Examples
- Extract structured invoice data such as amounts, dates, and customer info from PDF invoices.
- Process scanned contracts to extract parties, dates, and terms with bounding box annotations for tables and figures.
- Analyze research papers to extract titles, authors, abstracts, and keywords using a custom template.
Properties
| Name | Meaning |
|---|---|
| Binary Property | Name of the binary property containing the document to process. Supported formats include PDF, images (PNG/JPEG/GIF), Word (.docx), PowerPoint (.pptx), RTF, EPUB, LaTeX, and Jupyter Notebooks (.ipynb). This is required to provide the document data for OCR processing. |
| Model | Select the Mistral OCR model version to use for processing. Options include the latest recommended version, a specific May 2025 version, or a legacy March 2025 version, each offering different accuracy and compatibility. |
| Document Template | Choose a pre-configured document template or define custom fields for data extraction. Templates include invoice, letter, contract, receipt, ID document, research paper, or custom fields. |
| Custom Fields | Define multiple custom fields with names, types (number, string, date, array, boolean), descriptions, and whether they are required. Used only if Document Template is set to custom. |
| Include Element Analysis | Boolean option to include analysis of visual elements such as charts, figures, and tables in the document. |
| Advanced: Custom JSON Schema | Enable advanced mode to manually define JSON schemas for document and bounding box annotations. |
| BBox Annotation Schema | JSON schema defining the structure for visual element annotations when advanced mode and element analysis are enabled. |
| Document Annotation Schema | JSON schema defining the structure for document-level annotations when advanced mode is enabled. |
| Pages to Process | Specify pages to process for document annotations, e.g., '0-7' or '0,1,2,3'. Maximum 8 pages allowed. |
| Options | Additional options: whether to include the base64-encoded image in the response, and the file expiry time in hours (1-168). |
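
The Custom Fields and Pages to Process rows can be illustrated with a short Python sketch. It is not the node's implementation: `build_schema`, `parse_pages`, and the `TYPE_MAP` mapping are hypothetical names, and the mapping from field types to JSON Schema fragments (in particular `date` as a string with `format: date`, and arrays of strings) is an assumption. The page indices are treated as zero-based because the documented example `'0-7'` covers exactly 8 pages.

```python
from typing import Any

MAX_PAGES = 8  # limit stated in the "Pages to Process" row

# Assumed mapping from the node's field types to JSON Schema fragments.
TYPE_MAP: dict[str, dict[str, Any]] = {
    "string": {"type": "string"},
    "number": {"type": "number"},
    "boolean": {"type": "boolean"},
    "array": {"type": "array", "items": {"type": "string"}},
    "date": {"type": "string", "format": "date"},
}


def build_schema(fields: list[dict[str, Any]]) -> dict[str, Any]:
    """Turn custom field definitions into a JSON Schema object (hypothetical helper)."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for field in fields:
        if field["type"] not in TYPE_MAP:
            raise ValueError(f"unsupported field type: {field['type']}")
        prop = dict(TYPE_MAP[field["type"]])
        if field.get("description"):
            prop["description"] = field["description"]
        properties[field["name"]] = prop
        if field.get("required"):
            required.append(field["name"])
    return {"type": "object", "properties": properties, "required": required}


def parse_pages(spec: str) -> list[int]:
    """Expand a spec such as '0-7' or '0,1,2,3' into a sorted list of page indices."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            if end < start:
                raise ValueError(f"invalid page range: {part}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if not pages:
        raise ValueError("no pages given")
    if len(pages) > MAX_PAGES:
        raise ValueError(f"at most {MAX_PAGES} pages are allowed, got {len(pages)}")
    return sorted(pages)
```

For example, `parse_pages("0-7")` yields eight indices, while `parse_pages("0-8")` raises a `ValueError` because it would exceed the 8-page limit. Validating the spec before sending the request avoids spending an API call on input that is bound to fail.
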
Output
JSON
- `operation` - The OCR operation performed (e.g., 'ocrWithAnnotations').
- `uploadedFileId` - ID of the uploaded file in the Mistral API.
- `signedUrl` - Signed URL to access the uploaded document.
- `processedAt` - Timestamp when the document was processed.
- `documentTemplate` - The document template used for extraction (if applicable).
- `includeBboxAnnotations` - Boolean indicating if bounding box annotations were included.
- `advancedMode` - Boolean indicating if advanced mode was enabled.
- `*` (any other fields) - Additional OCR results and metadata returned by the Mistral OCR API.
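
To make the output shape concrete, here is a minimal Python sketch that separates the node's own metadata fields from the remaining OCR payload. The field names come from the list above; every value in `example_item`, the `pages` key, and the `split_output` helper are hypothetical placeholders, not real API output.

```python
from typing import Any

# Metadata fields documented in the Output section.
METADATA_KEYS = (
    "operation",
    "uploadedFileId",
    "signedUrl",
    "processedAt",
    "documentTemplate",
    "includeBboxAnnotations",
    "advancedMode",
)


def split_output(item: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split an output item into (node metadata, remaining OCR fields)."""
    metadata = {k: item[k] for k in METADATA_KEYS if k in item}
    ocr_result = {k: v for k, v in item.items() if k not in METADATA_KEYS}
    return metadata, ocr_result


# Placeholder item: values are invented for illustration only.
example_item: dict[str, Any] = {
    "operation": "ocrWithAnnotations",
    "uploadedFileId": "file-id-placeholder",
    "signedUrl": "https://example.invalid/signed-url",
    "processedAt": "2025-01-01T00:00:00Z",
    "documentTemplate": "invoice",
    "includeBboxAnnotations": False,
    "advancedMode": False,
    "pages": [],  # hypothetical stand-in for the additional OCR results
}

metadata, ocr_result = split_output(example_item)
```

Keeping the two groups apart makes it easier for downstream steps to log the metadata while passing only the extracted data along.
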
Dependencies
- Mistral OCR API
- An API key credential for Mistral OCR API authentication
Troubleshooting
- Ensure the specified binary property contains valid document data in a supported format; the node raises an error if the data is missing or the format is unsupported.
- File size must not exceed the maximum allowed size (e.g., 50MB); larger files will cause errors.
- Rate limit errors (HTTP 429) are handled with retries, but persistent rate limits indicate exceeding API plan capacity; consider upgrading the plan or reducing request frequency.
- Invalid JSON in custom schemas will cause errors; validate JSON syntax before use.
- If the node throws errors about missing or corrupted binary data, verify the input data source and format.
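
Several of the checks above can be reproduced before a request is sent. The Python sketch below shows one possible approach, not the node's actual logic: `preflight`, `with_retries`, and `RateLimitError` are hypothetical names, the 50MB limit is the example figure from the list, and exponential backoff is an assumed retry strategy (the source only says that retries happen).

```python
import json
import time
from typing import Any, Callable, Optional

MAX_FILE_BYTES = 50 * 1024 * 1024  # example limit from the list above


class RateLimitError(Exception):
    """Hypothetical error raised by `send` when the API answers HTTP 429."""


def preflight(data: bytes, schema_text: Optional[str] = None) -> None:
    """Check binary data, file size, and schema JSON before calling the API."""
    if not data:
        raise ValueError("binary property is missing or empty")
    if len(data) > MAX_FILE_BYTES:
        raise ValueError(f"file is {len(data)} bytes, limit is {MAX_FILE_BYTES}")
    if schema_text is not None:
        try:
            json.loads(schema_text)
        except json.JSONDecodeError as exc:
            raise ValueError(f"custom schema is not valid JSON: {exc}") from exc


def with_retries(
    send: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `send`, backing off exponentially whenever it raises RateLimitError."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # persistent rate limiting: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays. If the final attempt still hits the rate limit, the error is re-raised, which matches the advice above to reduce request frequency or upgrade the plan.
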
Links
- Mistral OCR API Documentation - Official documentation for the Mistral OCR API including usage, models, and schema definitions.