Overview
This node converts PDF documents into editable Word files (.docx). It accepts the source PDF in three ways: binary data from a previous node, a base64-encoded string, or a direct URL to the PDF file. Quality settings and OCR language options control the conversion, so scanned or image-based PDFs can be handled as well.
The node is useful when an automated workflow needs editable content from PDFs, for example for document editing, archiving, or further processing in a word processor. Typical uses are converting contracts received as PDFs into Word documents for review and modification, or digitizing scanned reports by applying OCR during conversion.
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to convert. Options: Binary Data (from previous node), Base64 String (directly provide encoded content), URL (link to PDF file). |
| Input Binary Field | Name of the binary property containing the PDF file when using Binary Data input type (usually "data"). |
| Base64 PDF Content | Base64 encoded string representing the PDF document content, used if Input Data Type is Base64 String. |
| PDF URL | URL pointing to the PDF file to convert, used if Input Data Type is URL. |
| Output File Name | Desired name for the output Word document file (e.g., "converted_document.docx"). |
| Document Name | Name of the source PDF file for reference purposes (e.g., "document.pdf"). |
| Quality Type | Conversion quality setting. Options: Draft (faster, suitable for simple PDFs with clear text), Quality (slower but more accurate, better for complex layouts). |
| OCR Language | Language used for Optical Character Recognition when converting scanned or image-based PDFs. Supported languages include Arabic, Chinese (Simplified/Traditional), Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Portuguese, Russian, Spanish, Swedish. |
| Advanced Options | Collection of additional optional settings: • Custom Profiles: JSON string to adjust custom API call properties. • Max Retries: Maximum polling attempts for async processing. • Merge All Sheets: Combine multiple pages into a single document flow. • Preserve Output Format: Whether to keep original formatting. • Retry Delay (Seconds): Base delay between polling attempts. • Use Async Processing: Enable asynchronous handling for large files. • Use OCR When Needed: Enable OCR for scanned PDFs. |
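The three input types in the table can be thought of as one normalization step that runs before any conversion request is made. The sketch below is only an illustration of that idea, not the node's actual implementation: the function names, the `"binaryData"`, `"base64"`, and `"url"` option values, and the returned `(kind, value)` pair are all assumptions made for this example.

```python
import base64
import json


def resolve_pdf_input(input_type, binary=None, base64_content=None, url=None):
    """Return a (kind, value) pair describing the PDF source.

    Hypothetical helper: the option values and return shape are assumptions
    that only mirror the Input Data Type row of the property table.
    """
    if input_type == "binaryData":
        if not binary:
            raise ValueError("No binary data found in the input binary field")
        # Binary input is encoded so every path yields text content or a URL.
        return "base64", base64.b64encode(binary).decode("ascii")
    if input_type == "base64":
        if not base64_content:
            raise ValueError("Base64 PDF Content is empty")
        try:
            # validate=True rejects characters outside the base64 alphabet.
            base64.b64decode(base64_content, validate=True)
        except ValueError as exc:
            raise ValueError("Base64 PDF Content is not valid base64") from exc
        return "base64", base64_content
    if input_type == "url":
        if not url or not url.startswith(("http://", "https://")):
            raise ValueError("PDF URL must start with http:// or https://")
        return "url", url
    raise ValueError(f"Unknown input type: {input_type}")


def parse_custom_profiles(raw):
    """Parse the Custom Profiles JSON string early so typos fail fast."""
    if not raw:
        return {}
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("Custom Profiles must be a JSON object")
    return parsed
```

Validating the input and the Custom Profiles string before calling the service means configuration mistakes surface as clear local errors rather than as failed conversions.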
Output
The node outputs an array of items where each item contains a json field with metadata and a binary field holding the converted Word document file. The binary data represents the .docx file resulting from the PDF conversion.
- json: Contains metadata about the conversion process or any relevant information.
- Binary data field (default name usually "data"): Contains the Word document file content ready for download or further processing.
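To make the output shape concrete, the following sketch builds and reads one item with a `json` part and a binary part. It assumes the common n8n convention of storing binary content as a base64 string next to a MIME type and file name; the metadata keys inside `json` (`fileName`, `fileSize`) are hypothetical, since the exact fields this node returns are not listed here.

```python
import base64

# Registered MIME type for .docx files.
DOCX_MIME = (
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)


def make_output_item(docx_bytes, file_name="converted_document.docx",
                     binary_field="data"):
    """Build an item shaped like the node output (metadata keys are assumed)."""
    return {
        "json": {"fileName": file_name, "fileSize": len(docx_bytes)},
        "binary": {
            binary_field: {
                "data": base64.b64encode(docx_bytes).decode("ascii"),
                "mimeType": DOCX_MIME,
                "fileName": file_name,
            }
        },
    }


def read_docx_bytes(item, binary_field="data"):
    """Decode the Word document bytes from an output item."""
    return base64.b64decode(item["binary"][binary_field]["data"])
```

A downstream step that expects a different binary field name only needs to pass that name as `binary_field`.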
Dependencies
- Requires access to an external PDF conversion service API that supports PDF to Word conversion with OCR capabilities.
- Needs proper API authentication credentials configured in n8n to authorize requests to the conversion service.
- Network access to fetch PDF files if using URL input type.
- Optional configuration of advanced options may require familiarity with the external API's profile settings.
Troubleshooting
Common Issues:
- Invalid or inaccessible PDF URL leading to failed downloads.
- Incorrect binary property name causing missing input data.
- Unsupported PDF formats or encrypted PDFs that cannot be converted.
- Insufficient API quota or invalid API credentials causing authorization errors.
- Large or complex PDFs exceeding default retry limits or timeouts.
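Several of these issues can be caught before the conversion request is sent. The sketch below is a hypothetical pre-flight check, not part of the node: it looks for the configured binary field, the `%PDF-` header, and an `/Encrypt` entry. The encryption test is a byte-level heuristic and can produce false positives or miss unusual files, so treat its result as a hint only.

```python
def find_input_problems(binary_fields, field_name):
    """Return a list of human-readable problems with the input PDF.

    binary_fields maps binary property names to raw bytes (an assumed,
    simplified view of the incoming item).
    """
    problems = []
    if field_name not in binary_fields:
        available = ", ".join(sorted(binary_fields)) or "none"
        problems.append(
            f'binary field "{field_name}" not found (available: {available})'
        )
        return problems

    data = binary_fields[field_name]
    # PDF files carry a "%PDF-" marker at or near the start of the file.
    if b"%PDF-" not in data[:1024]:
        problems.append("data does not look like a PDF (no %PDF- header)")
    # Heuristic: encrypted PDFs reference an /Encrypt dictionary.
    elif b"/Encrypt" in data:
        problems.append("PDF appears to be encrypted; unlock it first")
    return problems
```

An empty list means none of the checked problems were detected; it does not guarantee that the conversion will succeed.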
Error Messages & Resolutions:
- "File not found" or "Unable to download PDF": Verify the URL is correct and accessible.
- "Invalid binary data": Ensure the binary property name matches the actual input binary field.
- "Conversion failed due to unsupported format": Check if the PDF is corrupted or encrypted; try preprocessing or unlocking the PDF.
- "Authentication error": Confirm API key or token is correctly set up in n8n credentials.
- Timeout or polling exceeded max retries: Increase "Max Retries" and "Retry Delay" in advanced options for large files.
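The interaction between "Max Retries" and "Retry Delay" is easier to reason about with a small polling loop. This is a minimal sketch assuming a fixed delay between attempts; the node may instead scale the delay, since the option is described as a base delay. `check_status` is a hypothetical callable that returns `None` while the job is still running and the result once it is ready.

```python
import time


def poll_until_ready(check_status, max_retries=10, retry_delay=5.0,
                     sleep=time.sleep):
    """Poll check_status until it returns a result or attempts run out."""
    for attempt in range(1, max_retries + 1):
        result = check_status()
        if result is not None:
            return result
        # No point waiting after the final attempt.
        if attempt < max_retries:
            sleep(retry_delay)
    raise TimeoutError(
        f"Result not ready after {max_retries} polling attempts"
    )
```

Under this fixed-delay assumption the loop waits at most `(max_retries - 1) * retry_delay` seconds in total, plus the time spent in each status check, which gives a rough way to size both options for large files.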
Links and References
- PDF4me API Documentation — Reference for custom profiles and advanced API options.
- Optical Character Recognition (OCR) Languages — Information on supported OCR languages.
- Microsoft Word DOCX Format — Details about the Word document format output.