PDF to Word

Convert PDF files to Word documents

Overview

This node converts one or more PDF files into a single Word document (.docx). It is useful when users need to extract editable text and formatting from PDFs for further editing, reporting, or archiving. For example, a user might automate the conversion of scanned contracts or reports stored as PDFs into Word documents for easier review and modification.

The node accepts binary PDF files as input, sends them to an external API service that performs the conversion, and outputs the resulting Word document as binary data.

Properties

Name	Meaning
Operation	The action to perform. "Convert PDFs to Word" merges one or more PDFs into a single Word document.
API Endpoint	The URL of the external API service that handles PDF to Word conversion.
Request Timeout (Ms)	Maximum time in milliseconds to wait for the API response before timing out.

Output

json: Contains metadata about the conversion result:
- success: Boolean indicating if the conversion succeeded.
- message: A success message including how many PDFs were converted.
- originalFileCount: Number of PDF files processed.
- outputFileName: The generated Word document's filename.
binary.data: The actual Word document file in .docx format, with MIME type application/vnd.openxmlformats-officedocument.wordprocessingml.document. This binary output can be used downstream for saving, emailing, or further processing.
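The output described above can be pictured as the following shape. This is a minimal sketch, not the node's actual source: the type names, the `buildOutputItem` helper, and the exact message wording are hypothetical, and the binary payload is kept as raw bytes for simplicity, whereas the workflow runtime may store it differently.

```typescript
// MIME type of .docx files, as stated in the Output section.
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Hypothetical type mirroring the documented json fields.
interface ConversionResultJson {
  success: boolean;
  message: string;
  originalFileCount: number;
  outputFileName: string;
}

// Hypothetical, simplified output item; the real runtime type may differ.
interface OutputItem {
  json: ConversionResultJson;
  binary: { data: { data: Uint8Array; mimeType: string; fileName: string } };
}

// Assemble one output item from the converted document bytes.
function buildOutputItem(
  docx: Uint8Array,
  fileCount: number,
  outputFileName: string,
): OutputItem {
  return {
    json: {
      success: true,
      // Assumed wording; the node's real message may read differently.
      message: `Converted ${fileCount} PDF file(s) to Word`,
      originalFileCount: fileCount,
      outputFileName,
    },
    binary: {
      data: { data: docx, mimeType: DOCX_MIME, fileName: outputFileName },
    },
  };
}
```

A downstream step would read the metadata from `json` and the document itself from `binary.data`.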

Dependencies

Requires access to an external PDF-to-Word conversion API endpoint, configurable via the "API Endpoint" property.
Needs the ability to send HTTP POST requests with multipart/form-data containing PDF files.
The node expects input items to contain binary data representing PDF files.
No credentials are hardcoded in the node. Users must supply an API endpoint that either requires no authentication or has authentication handled externally.
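The request described above might look roughly like the sketch below. It assumes a runtime with global `fetch`, `FormData`, `Blob`, and `AbortSignal.timeout` (for example Node.js 18 or later). The form field name `files`, the `PdfFile` type, and both function names are assumptions for illustration; the actual API may expect a different field name or request layout.

```typescript
// Hypothetical input shape: one PDF per entry.
interface PdfFile {
  fileName: string;
  content: Blob;
}

// Build a multipart/form-data body with one part per PDF.
// The field name "files" is an assumption; the real API may use another name.
function buildPdfForm(files: PdfFile[]): FormData {
  const form = new FormData();
  for (const file of files) {
    form.append("files", file.content, file.fileName);
  }
  return form;
}

// POST the form to the configured endpoint, aborting after timeoutMs.
async function postPdfs(
  apiEndpoint: string,
  files: PdfFile[],
  timeoutMs: number,
): Promise<Response> {
  const response = await fetch(apiEndpoint, {
    method: "POST",
    body: buildPdfForm(files), // fetch sets the multipart boundary header itself
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) {
    throw new Error(`PDF to Word conversion failed: HTTP ${response.status}`);
  }
  return response;
}
```

Here `apiEndpoint` and `timeoutMs` would correspond to the "API Endpoint" and "Request Timeout (Ms)" properties. How the response body is decoded depends on the API and is deliberately left out.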

Troubleshooting

No binary data found in input:
- Error message: "Item X does not have binary data. Please ensure the input contains PDF files as binary data."
- Resolution: Verify that the input to the node includes binary PDF files. If some items lack binary data, enable "Continue On Fail" to skip those items without stopping the workflow.
No PDF files detected in binary data:
- Error message: "No processable PDF files found. Please ensure the input binary data contains PDF files."
- Resolution: Confirm that the binary data has correct MIME types or filenames ending with .pdf.
Failed to read binary data:
- Error message: "Unable to read binary data: [error details]"
- Resolution: Check that the binary data is accessible and correctly formatted.
API request errors or unexpected responses:
- Error message: "PDF to Word conversion failed: [error message]" or "API returned an unexpected response format."
- Resolution: Ensure the API endpoint is reachable, supports the expected request format, and returns valid responses. Adjust timeout settings if necessary.
Timeouts:
- Resolution: If the conversion takes longer than the configured timeout, increase the "Request Timeout (Ms)" value.
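To make the first two errors easier to reason about, here is a minimal sketch of the kind of input check that could produce them. The `InputItem` and `BinaryEntry` types and both function names are hypothetical simplifications; only the error messages and the "MIME type or .pdf filename" rule come from the text above.

```typescript
// Hypothetical, simplified shapes; the real node's item types may differ.
interface BinaryEntry {
  mimeType?: string;
  fileName?: string;
}
interface InputItem {
  binary?: Record<string, BinaryEntry>;
}

// A binary entry counts as a PDF if its MIME type or its file name says so.
function looksLikePdf(entry: BinaryEntry): boolean {
  const mime = (entry.mimeType ?? "").toLowerCase();
  const name = (entry.fileName ?? "").toLowerCase();
  return mime === "application/pdf" || name.endsWith(".pdf");
}

// Collect all PDF entries, throwing errors with the documented messages.
function collectPdfEntries(items: InputItem[]): BinaryEntry[] {
  const pdfs: BinaryEntry[] = [];
  items.forEach((item, index) => {
    const entries = Object.values(item.binary ?? {});
    if (entries.length === 0) {
      throw new Error(
        `Item ${index} does not have binary data. ` +
          "Please ensure the input contains PDF files as binary data.",
      );
    }
    pdfs.push(...entries.filter(looksLikePdf));
  });
  if (pdfs.length === 0) {
    throw new Error(
      "No processable PDF files found. " +
        "Please ensure the input binary data contains PDF files.",
    );
  }
  return pdfs;
}
```

In this sketch a file named `Report.PDF` with a generic MIME type still passes, because the filename comparison is case-insensitive. Whether the node itself is that lenient is not stated in the source.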

Links and References

Multipart/form-data specification
Microsoft Word Open XML Format
Example external PDF to Word conversion APIs (not specific to this node):
- https://www.zamzar.com/api/
- https://cloudmersive.com/pdf-to-word-api

Note: This summary is based on static analysis of the provided source code and property definitions. Runtime behavior depends on the external API service used.
