PDF4ME

Generate barcodes, convert URLs to PDF, convert PDFs to Word, convert JSON to Excel, crop images, and more using the PDF4ME API.


Overview

This node converts PDF documents into editable Word (.docx) files. It supports multiple input methods for the source PDF, including binary data from previous nodes, base64-encoded strings, URLs, or local file paths. The node offers options to control conversion quality and OCR language for scanned PDFs or images within the document. It is useful in workflows where automated extraction and editing of PDF content are required, such as digitizing scanned contracts, converting reports for further editing, or integrating PDF content into document management systems.

Practical examples:

Automatically convert uploaded PDF invoices into Word documents for downstream processing.
Convert scanned PDF forms using OCR to editable Word format for data extraction.
Fetch a PDF from a URL and convert it to Word for content repurposing.

Properties

Name	Meaning
Input Data Type	Method to provide the PDF file: Binary Data (from previous node), Base64 String (encoded PDF content), URL (link to PDF), or File Path (local file system path).
Input Binary Field	Name of the binary property containing the PDF file when using Binary Data input type (default is "data").
Base64 PDF Content	Base64 encoded string of the PDF document content (used if Input Data Type is Base64 String).
PDF URL	URL pointing to the PDF file to convert (used if Input Data Type is URL).
Local File Path	Local file system path to the PDF file (used if Input Data Type is File Path).
Output File Name	Desired filename for the resulting Word document (default: "converted_document.docx").
Document Name	Reference name for the source PDF file (default: "document.pdf").
Quality Type	Conversion quality setting: "Draft" (faster, suitable for simple PDFs with clear text) or "Quality" (slower but more accurate, better for complex layouts).
OCR Language	Language used for Optical Character Recognition on scanned PDFs or images. Options include Arabic, Chinese (Simplified/Traditional), Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Portuguese, Russian, Spanish, Swedish.
Advanced Options	Collection of additional settings:
- Custom Profiles	JSON string to customize API call properties for advanced configuration.
- Max Retries	Maximum number of polling attempts for asynchronous processing (use a higher value for complex PDFs).
- Merge All Sheets	Whether to combine multiple pages into a single continuous document flow (true/false).
- Preserve Output Format	Whether to preserve original formatting when possible (true/false).
- Retry Delay (Seconds)	Base delay in seconds between polling attempts; actual delay increases exponentially.
- Use Async Processing	Enable asynchronous processing for better handling of large files (true/false).
- Use OCR When Needed	Enable OCR automatically when needed for scanned PDFs (true/false).
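
For the Base64 String input type, the Base64 PDF Content value is the PDF's raw bytes encoded as base64. The following is a minimal sketch of preparing that value outside n8n. It assumes plain standard base64 with no data-URI prefix is expected, which is not confirmed here, and `pdf_to_base64` is a hypothetical helper name, not part of the node.

```python
import base64


def pdf_to_base64(pdf_bytes: bytes) -> str:
    """Return base64 text for a PDF, rejecting data without a PDF header."""
    # PDF files begin with the marker "%PDF-". This strict check catches
    # obviously wrong inputs (HTML error pages, images) before any upload.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not start with a %PDF- header")
    return base64.b64encode(pdf_bytes).decode("ascii")


# Stand-in bytes for illustration only; a real call would read a file,
# e.g. open("document.pdf", "rb").read().
sample = b"%PDF-1.7\n% stand-in content, not a complete document\n"
encoded = pdf_to_base64(sample)
print(encoded[:16])
```

The header check is a cheap pre-flight test; it does not prove the file is a complete, uncorrupted PDF.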

Output

Each output item contains the converted Word document together with conversion metadata, in two fields:

json: Metadata about the conversion result.
binary: The Word document file in binary form, named according to the specified output file name.

The binary data is the converted .docx file, ready for download or further processing in the workflow.
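
As an illustration of consuming this output outside the workflow, the sketch below decodes the binary field of one exported item. It assumes n8n's default in-memory binary mode, where each binary entry stores its content as base64 under `data` next to `fileName` and `mimeType`. It also assumes the binary property is named "data". The `json` metadata shown is a placeholder, not the node's actual metadata, and `extract_binary` is a hypothetical helper.

```python
import base64


def extract_binary(item: dict, prop: str = "data") -> tuple[str, bytes]:
    """Return (file name, raw bytes) for one binary property of an item."""
    entry = item["binary"][prop]
    # Fall back to the node's documented default output name if none is set.
    name = entry.get("fileName", "converted_document.docx")
    return name, base64.b64decode(entry["data"])


# Hand-built stand-in item; .docx files are ZIP archives, so real content
# starts with the bytes "PK".
item = {
    "json": {"note": "placeholder metadata"},
    "binary": {
        "data": {
            "fileName": "converted_document.docx",
            "mimeType": "application/vnd.openxmlformats-officedocument"
                        ".wordprocessingml.document",
            "data": base64.b64encode(b"PK\x03\x04 stand-in").decode("ascii"),
        }
    },
}

name, payload = extract_binary(item)
print(name, len(payload))
```

If n8n is configured to keep binary data on disk instead of in memory, the `data` value may be a reference rather than base64, and this approach would not apply.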

Dependencies

Requires access to the PDF4me API, the external service that performs the PDF-to-Word conversion.
Needs appropriate API authentication credentials configured in n8n (e.g., an API key).
Network access to fetch PDFs from URLs if that input method is used.
For OCR functionality, the service must support the selected OCR languages.

Troubleshooting

Common issues:
- Invalid or inaccessible PDF URL leading to download failures.
- Incorrect binary property name causing missing input data errors.
- Unsupported or corrupted PDF files causing conversion errors.
- Insufficient API quota or invalid API credentials resulting in authorization errors.
- Long processing times for large or complex PDFs; consider adjusting retry and async options.

Error messages and resolutions:
- "Input binary data not found": Verify the binary property name matches the actual input.
- "Failed to fetch PDF from URL": Check URL accessibility and network connectivity.
- "Conversion failed due to unsupported format": Ensure the input file is a valid PDF.
- "API authentication error": Confirm API credentials are correctly set up in n8n.
- "Timeout waiting for conversion": Increase max retries or retry delay in advanced options.
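
When tuning Max Retries and Retry Delay to resolve a timeout, it helps to estimate how long the node would wait in total. The sketch below assumes the delay doubles on each attempt. The node only states that the delay grows exponentially, so the factor of 2 and the helper name `backoff_delays` are illustrative assumptions.

```python
def backoff_delays(base_delay: float, max_retries: int,
                   factor: float = 2.0) -> list[float]:
    """List the wait before each polling attempt under exponential backoff."""
    # Attempt 0 waits base_delay, attempt 1 waits base_delay * factor, etc.
    return [base_delay * factor ** attempt for attempt in range(max_retries)]


delays = backoff_delays(base_delay=2, max_retries=5)
total = sum(delays)
print(delays, total)
```

Under these assumptions, a 2-second base delay with 5 retries waits about a minute in total, and each extra retry roughly doubles that, so raising Max Retries extends the window much faster than raising the base delay.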

Links and References

PDF4me API Documentation
General info on PDF to Word conversion and OCR technologies can be found on vendor websites or technical blogs related to document processing.