Actions (80)
- Extract Text From Word
- Find And Replace Text
- Convert PDF To Editable PDF Using OCR
- Create Swiss QR Bill
- Split PDF By Barcode
- Split PDF By Swiss QR
- Split PDF By Text
- Split PDF Regular
- Create PDF/A
- Convert HTML To PDF
- Convert Markdown To PDF
- Upload File To PDF4me
- Add Attachment To PDF
- Add Barcode To PDF
- Add Form Fields To PDF
- Fill PDF Form
- Add HTML Header Footer
- Add Image Stamp To PDF
- Add Margin To PDF
- Add Page Number To PDF
- Add Text Stamp To PDF
- AI-Invoice Parser
- AI-Process HealthCard
- AI-Process Contract
- Generate Barcode
- Classify Document
- Parse Document
- Linearize PDF
- Flatten PDF
- Convert To PDF
- Json To Excel
- Convert PDF To Excel
- Convert PDF To Word
- Convert PDF To PowerPoint
- Convert VISIO
- Crop Image
- Delete Blank Pages From PDF
- Delete Unwanted Pages From PDF
- Extract Pages
- Merge Multiple PDFs
- Overlay PDFs
- Rotate Document
- Rotate Page
- Sign PDF
- URL to PDF
- Add Image Watermark To Image
- Add Text Watermark To Image
- Compress Image
- Convert Image Format
- Create Images From PDF
- Flip Image
- Get Image Metadata
- Image Extract Text
- Remove EXIF Tags From Image
- Replace Text With Image
- Replace Text With Image In Word
- Resize Image
- Rotate Image
- Rotate Image By EXIF Data
- Compress PDF
- Get PDF Metadata
- Repair PDF Document
- Get Document From Pdf4me
- Update Hyperlinks Annotation
- Protect Document
- Unlock PDF
- Disable Tracking Changes In Word
- Enable Tracking Changes In Word
- Generate Document Single
- Generate Documents Multiple
- Get Tracking Changes In Word
- Read Barcode From Image
- Read Barcode From PDF
- Read SwissQR Code
- Extract Form Data From PDF
- Extract Pages From PDF
- Extract Attachment From PDF
- Extract Text By Expression
- Extract Table From PDF
- Extract Resources
Overview
This node converts PDF documents into Excel spreadsheets. It accepts the PDF as binary data from a previous node, as a Base64-encoded string, or as a URL pointing to the file. Conversion can be tuned by selecting a quality level (Draft or Quality), enabling OCR for scanned PDFs, and choosing whether to merge all sheets into one or keep them separate. The node is useful when tabular data embedded in PDFs must be extracted for analysis, reporting, or spreadsheet-based workflows.
Practical examples:
- Extracting financial tables from monthly PDF reports into Excel for data analysis.
- Converting scanned invoices or receipts into editable Excel files using OCR.
- Automating data extraction from PDF forms submitted online and converting them into structured Excel sheets.
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to convert to Excel. Options: Binary Data (from previous node), Base64 String (provide encoded content), URL (link to PDF file). |
| Input Binary Field | Name of the binary property containing the PDF file when using Binary Data input type (usually "data"). |
| Base64 PDF Content | Base64-encoded string containing the PDF document, used when Input Data Type is Base64 String. |
| PDF URL | URL to the PDF file to convert, used when Input Data Type is URL. |
| Quality Type | Select conversion quality: Draft (faster, suitable for simple PDFs with clear tables) or Quality (slower but more accurate, better for complex layouts). |
| Language | OCR language setting for text recognition in images or scanned PDFs (e.g., English). |
| Merge All Sheets | Boolean option to combine all Excel sheets into one sheet (true) or keep them as separate sheets (false). |
| Output Format | Boolean option to preserve original formatting when possible during conversion. |
| OCR When Needed | Boolean option to enable OCR (Optical Character Recognition) for scanned PDFs that require text extraction. |
| Output File Name | Name for the resulting Excel output file (default: "PDF_to_EXCEL_output.xlsx"). |
| Document Name | Name of the source PDF file for reference purposes (default: "output.pdf"). |
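
To make the three input modes concrete, here is a minimal Python sketch of how a workflow step might normalize each one before conversion. The helper name `resolve_pdf_input`, the parameter keys (`inputDataType`, `binaryPropertyName`, `base64Content`, `pdfUrl`), and the item layout (`binary[<field>]["data"]` holding Base64 text) are illustrative assumptions, not the node's actual implementation.

```python
import base64
import binascii


def resolve_pdf_input(params: dict, item: dict) -> dict:
    """Normalize the configured input mode to a small descriptor.

    Hypothetical helper: the keys mirror the property names in the table
    above but are not the node's real internal identifiers.
    """
    input_type = params.get("inputDataType", "binaryData")

    if input_type == "binaryData":
        # Assumed item layout: item["binary"][field]["data"] is Base64 text.
        field = params.get("binaryPropertyName", "data")
        binary = item.get("binary", {})
        if field not in binary:
            raise KeyError(f"File not found in binary property: {field!r}")
        return {"source": "base64", "content": binary[field]["data"]}

    if input_type == "base64":
        content = params.get("base64Content", "").strip()
        if not content:
            raise ValueError("Base64 PDF Content is empty")
        try:
            # validate=True rejects characters outside the Base64 alphabet.
            base64.b64decode(content, validate=True)
        except (binascii.Error, ValueError) as exc:
            raise ValueError("Base64 PDF Content is not valid Base64") from exc
        return {"source": "base64", "content": content}

    if input_type == "url":
        url = params.get("pdfUrl", "")
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"PDF URL must be http(s): {url!r}")
        # Downloading is left to the caller in this sketch.
        return {"source": "url", "content": url}

    raise ValueError(f"Unknown input data type: {input_type!r}")
```

Validating the input up front surfaces a wrong binary field name or malformed Base64 before any request is sent to the conversion service.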
Output
The node outputs JSON data describing the result of the PDF to Excel conversion. The Excel file itself is typically attached to the output item as binary data or made available as a downloadable file, and it is named according to the "Output File Name" property.
When binary output is present, the Excel file can be passed to subsequent nodes for saving, emailing, or further processing.
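
As an illustration of the binary output described above, the following Python sketch wraps converted bytes in an n8n-style item and writes them back to disk. The item layout (a `json` part plus a `binary` property with Base64 `data`, `mimeType`, and `fileName`), the JSON keys, and both helper names are assumptions made for the example; the node's real output fields may differ.

```python
import base64
from pathlib import Path

# Standard MIME type for .xlsx workbooks.
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_output_item(xlsx_bytes: bytes,
                      file_name: str = "PDF_to_EXCEL_output.xlsx") -> dict:
    """Wrap converted bytes in an n8n-style item (assumed layout)."""
    if not file_name.lower().endswith(".xlsx"):
        file_name += ".xlsx"
    return {
        # Hypothetical summary fields; not confirmed by the node's docs.
        "json": {"fileName": file_name, "fileSize": len(xlsx_bytes)},
        "binary": {
            "data": {
                "data": base64.b64encode(xlsx_bytes).decode("ascii"),
                "mimeType": XLSX_MIME,
                "fileName": file_name,
            }
        },
    }


def save_output_item(item: dict, directory: str, field: str = "data") -> Path:
    """Decode the binary property and write the workbook to disk."""
    entry = item["binary"][field]
    target = Path(directory) / entry["fileName"]
    target.write_bytes(base64.b64decode(entry["data"]))
    return target
```

A downstream step would call `save_output_item(item, "/tmp")` or hand the same binary property to an email or storage node.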
Dependencies
- Requires access to an external PDF processing service or API capable of converting PDFs to Excel format, including OCR capabilities.
- Needs proper authentication credentials (such as an API key or token) configured in n8n to interact with the external service.
- Network access to fetch PDF files if URLs are used as input.
- The OCR language setting requires the service to support the specified language.
Troubleshooting
Common issues:
- Invalid or inaccessible PDF URL leading to download failures.
- Incorrect binary property name causing the node to not find the PDF file in input data.
- Unsupported PDF formats or encrypted PDFs may cause conversion errors.
- OCR failures if the language is not supported or the PDF quality is too low.
- Large or complex PDFs might lead to timeouts or slow processing.
Error messages and resolutions:
- "File not found in binary property": Verify the binary property name matches the actual input.
- "Failed to download PDF from URL": Check URL accessibility and network connectivity.
- "Conversion failed due to unsupported format": Ensure the PDF is not corrupted or encrypted.
- "OCR language not supported": Use a supported language code or disable OCR if not needed.
- "API authentication error": Confirm that the API key or credentials are correctly set up in n8n.
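
Several of the errors above can be caught before the file is sent for conversion. The sketch below is a rough pre-flight check in Python, using a hypothetical helper named `preflight_pdf`. It relies on simple byte-level heuristics (the `%PDF-` header, the `%%EOF` trailer marker, and an `/Encrypt` entry), so it can miss problems or raise false alarms and is not a substitute for the service's own validation.

```python
import re


def preflight_pdf(data: bytes) -> list[str]:
    """Return a list of likely problems; an empty list means none detected."""
    problems = []

    # A PDF carries the %PDF- marker near the start of the file.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header: file may be corrupted or not a PDF")

    # Encrypted PDFs reference an /Encrypt dictionary in the trailer.
    # \b keeps unrelated keys such as /EncryptMetadata from matching.
    if re.search(rb"/Encrypt\b", data):
        problems.append("encryption marker found: unlock the PDF before converting")

    # A complete file normally ends with the %%EOF marker.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker: file may be truncated")

    return problems
```

Running this on the decoded input and logging the returned messages gives a clearer hint than a generic "Conversion failed" response.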