Actions (80)
- Extract Text From Word
- Find And Replace Text
- Convert PDF To Editable PDF Using OCR
- Create Swiss QR Bill
- Split PDF By Barcode
- Split PDF By Swiss QR
- Split PDF By Text
- Split PDF Regular
- Create PDF/A
- Convert HTML To PDF
- Convert Markdown To PDF
- Upload File To PDF4me
- Add Attachment To PDF
- Add Barcode To PDF
- Add Form Fields To PDF
- Fill PDF Form
- Add HTML Header Footer
- Add Image Stamp To PDF
- Add Margin To PDF
- Add Page Number To PDF
- Add Text Stamp To PDF
- AI-Invoice Parser
- AI-Process HealthCard
- AI-Process Contract
- Generate Barcode
- Classify Document
- Parse Document
- Linearize PDF
- Flatten PDF
- Convert To PDF
- Json To Excel
- Convert PDF To Excel
- Convert PDF To Word
- Convert PDF To PowerPoint
- Convert VISIO
- Crop Image
- Delete Blank Pages From PDF
- Delete Unwanted Pages From PDF
- Extract Pages
- Merge Multiple PDFs
- Overlay PDFs
- Rotate Document
- Rotate Page
- Sign PDF
- URL to PDF
- Add Image Watermark To Image
- Add Text Watermark To Image
- Compress Image
- Convert Image Format
- Create Images From PDF
- Flip Image
- Get Image Metadata
- Image Extract Text
- Remove EXIF Tags From Image
- Replace Text With Image
- Replace Text With Image In Word
- Resize Image
- Rotate Image
- Rotate Image By EXIF Data
- Compress PDF
- Get PDF Metadata
- Repair PDF Document
- Get Document From Pdf4me
- Update Hyperlinks Annotation
- Protect Document
- Unlock PDF
- Disable Tracking Changes In Word
- Enable Tracking Changes In Word
- Generate Document Single
- Generate Documents Multiple
- Get Tracking Changes In Word
- Read Barcode From Image
- Read Barcode From PDF
- Read SwissQR Code
- Extract Form Data From PDF
- Extract Pages From PDF
- Extract Attachment From PDF
- Extract Text By Expression
- Extract Table From PDF
- Extract Resources
Overview
The node's "Classify Document" operation classifies a PDF document. The PDF can be supplied as binary data from a previous node, as a Base64-encoded string, or as a URL pointing to the file. Classification is useful in workflows that need automatic document categorization, such as sorting invoices, contracts, or other document types by their content.
Practical examples include:
- Automatically classifying incoming scanned PDFs in an invoice processing workflow.
- Categorizing contracts or legal documents fetched from URLs for further automated processing.
- Classifying Base64-encoded PDF content received from APIs or other sources without saving files locally.
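The last example relies on turning raw PDF bytes into a Base64 string in memory. A minimal sketch of that step follows; the helper name `pdf_bytes_to_base64` and the stand-in payload are illustrative assumptions and are not part of the node.

```python
import base64


def pdf_bytes_to_base64(pdf_bytes: bytes) -> str:
    """Return the Base64 text form of raw PDF bytes (no data-URI prefix)."""
    return base64.b64encode(pdf_bytes).decode("ascii")


# Tiny stand-in payload; a real PDF would typically come from an API response body.
sample = b"%PDF-1.4\n%%EOF\n"
encoded = pdf_bytes_to_base64(sample)
print(encoded[:8])
```

The resulting string is what would be placed in the Base64 PDF Content property when Input Data Type is set to Base64 String.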
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to classify. Options: Binary Data (PDF from previous node), Base64 String (base64 encoded PDF content), URL (link to PDF file). |
| Input Binary Field | Name of the binary property containing the PDF file when using Binary Data input type. Usually "data" for file uploads. |
| Base64 PDF Content | Base64-encoded content of the PDF document to classify. Used when Input Data Type is set to Base64 String. |
| PDF URL | URL to the PDF file to classify, used when Input Data Type is set to URL. |
| Document Name | Name of the document, used internally during processing. Defaults to "document.pdf". |
| Advanced Options | Collection of optional advanced settings. Includes "Custom Profiles", which accepts JSON to adjust custom properties and API call options that apply only to certain APIs, for example to set the output data format. |
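Each Input Data Type value depends on exactly one of the other properties being filled in. The sketch below makes that pairing explicit. It uses the display names from the table as dictionary keys, which is an assumption for illustration; the node's internal parameter names are not documented here, and `missing_fields` is a hypothetical helper.

```python
# Display names come from the Properties table; the dict layout is hypothetical.
REQUIRED_FIELD = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 PDF Content",
    "URL": "PDF URL",
}


def missing_fields(params: dict) -> list:
    """Return the property names that still need a value for the chosen input type."""
    input_type = params.get("Input Data Type")
    if input_type not in REQUIRED_FIELD:
        return ["Input Data Type"]
    field = REQUIRED_FIELD[input_type]
    value = params.get(field)
    # Treat empty or whitespace-only strings as missing.
    return [] if isinstance(value, str) and value.strip() else [field]


print(missing_fields({"Input Data Type": "URL", "PDF URL": "https://example.com/a.pdf"}))
print(missing_fields({"Input Data Type": "Base64 String"}))
```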
Output
The node outputs JSON data containing the classification result for the processed PDF document. The exact structure depends on the classification API response, but it typically includes the detected document category or type.
When the input is binary data, the node may also output binary fields containing the processed document or related files; this behavior is not documented in detail here.
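Because the response structure is not fixed in this description, downstream logic is safest when it reads the category defensively. The following sketch routes a result to a queue; the field name `className`, the category values, and the queue names are all hypothetical and should be replaced with whatever the actual API response contains.

```python
def route_for(result: dict, key: str = "className") -> str:
    """Map a classification result to a queue name, falling back to manual review.

    `key` is a hypothetical field name; check the real node output before relying on it.
    """
    category = str(result.get(key, "")).strip().lower()
    routes = {"invoice": "accounts-payable", "contract": "legal-review"}
    return routes.get(category, "manual-review")


print(route_for({"className": "Invoice"}))
print(route_for({}))
```

Falling back to a manual-review queue keeps unknown or missing categories from being silently dropped.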
Dependencies
- Requires access to an external PDF processing/classification API service.
- Needs appropriate API authentication credentials configured in n8n (e.g., an API key).
- Network access to fetch PDF files if using the URL input method.
Troubleshooting
Common issues:
- Providing incorrect or missing binary data field name when using Binary Data input type.
- Invalid base64 string format causing decoding errors.
- Inaccessible or invalid PDF URL leading to download failures.
- Missing or invalid API credentials resulting in authentication errors.
Error messages and resolutions:
- "Binary property not found": Verify the binary field name matches the actual property in the input data.
- "Invalid base64 content": Ensure the base64 string is correctly encoded and complete.
- "Failed to fetch PDF from URL": Check the URL accessibility and correctness.
- "Authentication failed": Confirm API credentials are properly set up and valid.
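Several of these errors can be caught before the node runs, for example in a preceding code step. The sketch below checks that a Base64 string decodes strictly and begins with the standard `%PDF-` file header, and that a URL has an `http` or `https` scheme and a host. The function names are illustrative, and the restriction to `http`/`https` is an assumption about what the service can fetch.

```python
import base64
import binascii
from urllib.parse import urlparse


def check_base64_pdf(content: str) -> str:
    """Return 'ok' or a short reason the Base64 input looks unusable."""
    try:
        # validate=True rejects characters outside the Base64 alphabet,
        # including embedded newlines, instead of silently skipping them.
        raw = base64.b64decode(content, validate=True)
    except (binascii.Error, ValueError):
        return "not valid Base64"
    if not raw.startswith(b"%PDF-"):
        return "decoded data does not start with the %PDF- header"
    return "ok"


def check_pdf_url(url: str) -> str:
    """Return 'ok' or a short reason the URL is unlikely to be fetchable."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return "URL scheme must be http or https"
    if not parts.netloc:
        return "URL has no host"
    return "ok"


print(check_base64_pdf("not base64!!"))
print(check_pdf_url("ftp://example.com/doc.pdf"))
```

These checks only cover syntax. A URL that passes can still be unreachable, and a file with a valid header can still be corrupt, so the node's own error messages remain the final authority.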
Links and References
- PDF4me API Documentation
- General info on base64 encoding: https://en.wikipedia.org/wiki/Base64
- n8n documentation on handling binary data: https://docs.n8n.io/nodes/working-with-binary-data/