Overview
The "Extract Table From PDF" operation extracts tabular data from PDF documents. It is useful when structured data is embedded in PDFs such as invoices, reports, or forms, and you want to convert those tables into machine-readable JSON or other formats for further processing or analysis.
Practical examples include:
- Extracting invoice line items from a PDF invoice.
- Parsing tables from financial reports or statements.
- Converting survey results or form responses stored in PDF tables into structured data.
The node supports multiple ways to provide the PDF input: as binary data from a previous node, as a base64 encoded string, or via a URL pointing to the PDF file.
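For the base64 input option, the PDF bytes have to be encoded before they are passed in. A minimal sketch of that step in Python follows; the helper name `pdf_bytes_to_base64` and the sample bytes are illustrative assumptions, and the `%PDF-` header check is only a quick, deliberately strict sanity test, not something the node is known to perform.

```python
import base64


def pdf_bytes_to_base64(pdf_bytes: bytes) -> str:
    """Return PDF content as a single-line base64 string (hypothetical helper)."""
    # Strict sanity check: PDF files normally begin with the "%PDF-" header.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("content does not start with the %PDF- header")
    # b64encode never inserts line breaks, so the result is safe for a text field.
    return base64.b64encode(pdf_bytes).decode("ascii")


# Tiny stand-in for real file contents. In practice the bytes would come from
# something like pathlib.Path("invoice.pdf").read_bytes() (hypothetical file name).
sample = b"%PDF-1.7\n%%EOF\n"
encoded = pdf_bytes_to_base64(sample)
```

The resulting string is what would go into the Base64 PDF Content field when Input Data Type is set to Base64 String.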
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to extract tables from. Options: • Binary Data (PDF file from previous node) • Base64 String (PDF content as base64 encoded string) • URL (URL to PDF file) |
| Input Binary Field | Name of the binary property containing the PDF file (usually "data" for file uploads). Required if Input Data Type is Binary Data. |
| Base64 PDF Content | Base64 encoded PDF document content. Required if Input Data Type is Base64 String. |
| PDF URL | URL to the PDF file to extract tables from. Required if Input Data Type is URL. |
| Document Name | Name of the document used for processing. Defaults to "document.pdf". |
| Advanced Options | Collection of advanced options. Currently supports: • Custom Profiles: A JSON string that sets custom properties for the API call, e.g., { "outputDataFormat": "json" }. Use it to customize extraction behavior as described in the external API documentation. |
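Because Custom Profiles is entered as a JSON string, it can help to generate it from a dictionary rather than typing it by hand. The sketch below assumes only the `outputDataFormat` key shown in the table; the helper name `build_custom_profiles` is hypothetical, and which keys the API actually accepts is left to its documentation.

```python
import json


def build_custom_profiles(options: dict) -> str:
    """Serialize options to a JSON string for the Custom Profiles field (hypothetical helper)."""
    text = json.dumps(options)
    json.loads(text)  # round-trip to confirm the string parses as JSON
    return text


profiles = build_custom_profiles({"outputDataFormat": "json"})
print(profiles)  # {"outputDataFormat": "json"}

# Strict JSON requires double quotes, so a single-quoted variant is rejected
# by a standard parser. Whether the API itself tolerates it is not stated here.
try:
    json.loads("{ 'outputDataFormat': 'json' }")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
```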
Output
The extracted table data is returned as JSON under the json field of each output item. The structure typically includes rows and columns representing the table contents parsed from the PDF.
No binary output is documented for this operation; it focuses on JSON extraction of tables.
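Downstream steps often need the rows in another format such as CSV. The sketch below shows one way to do that conversion in Python. The exact response schema is not documented here, so the `tables` and `rows` keys and the sample values are assumptions made purely for illustration; adjust them to the structure the node actually returns.

```python
import csv
import io


def table_rows_to_csv(rows) -> str:
    """Render a list of rows (each a list of cell values) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# ASSUMED shape of one item's json field; the real keys may differ.
item_json = {"tables": [{"rows": [["Item", "Qty"], ["Widget", "2"]]}]}

csv_text = table_rows_to_csv(item_json["tables"][0]["rows"])
print(csv_text)
# Item,Qty
# Widget,2
```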
Dependencies
- Requires access to the external PDF4me API, which performs the table extraction.
- Needs appropriate API authentication configured in n8n (e.g., an API key credential).
- Network access to fetch PDF files if using the URL input option.
Troubleshooting
Common issues:
- Providing incorrect input data type or missing required fields for the selected input type.
- Invalid or inaccessible PDF URLs leading to download failures.
- Malformed base64 strings causing decoding errors.
- Unsupported or corrupted PDF files that the external API cannot process.
- Incorrectly formatted custom profiles JSON causing API call failures.
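Most of these issues can be caught before the request is sent. The following pre-flight check is a rough sketch, not part of the node: the function name `check_inputs` and the input type identifiers `binaryData`, `base64`, and `url` are hypothetical stand-ins for the three Input Data Type options.

```python
import base64
import binascii
import json
from urllib.parse import urlparse


def check_inputs(input_type, binary_field="", base64_content="",
                 pdf_url="", custom_profiles=""):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if input_type == "binaryData":
        if not binary_field:
            problems.append("Input Binary Field is required for Binary Data")
    elif input_type == "base64":
        if not base64_content:
            problems.append("Base64 PDF Content is required for Base64 String")
        else:
            try:
                # validate=True rejects characters outside the base64 alphabet
                base64.b64decode(base64_content, validate=True)
            except (binascii.Error, ValueError):
                problems.append("Base64 PDF Content is not valid base64")
    elif input_type == "url":
        parsed = urlparse(pdf_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("PDF URL must be an absolute http(s) URL")
    else:
        problems.append(f"Unknown input type: {input_type!r}")
    if custom_profiles:
        try:
            json.loads(custom_profiles)
        except json.JSONDecodeError:
            problems.append("Custom Profiles is not valid JSON")
    return problems
```

A call such as `check_inputs("url", pdf_url="example.com/file.pdf")` reports the missing scheme, while a fully specified input returns an empty list. The check cannot tell whether a URL is reachable or whether a PDF is corrupted; those failures still surface only at request time.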
Error messages and resolutions:
- "Failed to fetch PDF from URL": Check the URL accessibility and correctness.
- "Invalid base64 content": Verify the base64 string is properly encoded without extra characters.
- "API authentication error": Ensure the API key or credentials are correctly set up in n8n.
- "Table extraction failed": Confirm the PDF contains recognizable tables and try adjusting advanced options or profiles.
Links and References
- PDF4me API Documentation: details on custom profiles and advanced options.
- General background on PDF table extraction techniques and best practices is available in the documentation of PDF processing libraries and services.