Overview
This node converts documents between formats using Pandoc. Use it to transform content from one markup or file format to another, such as Markdown to HTML, Microsoft Word to PDF, or LaTeX to plain text. The node can also extract embedded images during conversion and emit them on a separate output.
Practical examples include:
- Converting Markdown notes into polished HTML pages.
- Transforming DOCX reports into PDFs for distribution.
- Extracting images from complex documents while converting their text content.
Properties
| Name | Meaning |
|---|---|
| Binary Property | Name of the binary property containing the input file to convert (e.g., "data"). |
| From Format | Input document format. Options: Markdown, HTML, Microsoft Word, LaTeX, Plain Text. |
| To Format | Desired output document format. Options: Markdown, HTML, Microsoft Word, PDF, LaTeX, Plain Text. |
| Additional Options | Extra command-line options for Pandoc (e.g., --standalone --toc) to customize conversion. |
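As a rough illustration of how these properties might combine into a Pandoc invocation, the sketch below builds an argument list from the two format values and the Additional Options string. The helper names `splitOptions` and `buildPandocArgs` are hypothetical, and the format identifiers passed in are assumed to be Pandoc's own names (such as `markdown` and `html`); the node's actual mapping may differ.

```typescript
// Hypothetical helper: split the Additional Options string on whitespace.
function splitOptions(options: string): string[] {
  return options
    .trim()
    .split(/\s+/)
    .filter((part) => part.length > 0);
}

// Hypothetical helper: combine the node properties into Pandoc arguments.
// `from` and `to` are assumed to already be Pandoc format identifiers.
function buildPandocArgs(from: string, to: string, additionalOptions: string): string[] {
  return ["--from", from, "--to", to, ...splitOptions(additionalOptions)];
}

console.log(buildPandocArgs("markdown", "html", "--standalone --toc"));
```

Plain whitespace splitting does not handle quoted values that contain spaces, so options of that kind would need a real shell-style parser.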
Output
The node produces two outputs:
Converted Document (main output):
- Contains the converted document in binary form.
- The binary data is base64 encoded.
- Includes metadata fields:
  - mimeType: MIME type corresponding to the output format (e.g., application/pdf for PDF).
  - fileName: Generated filename with an extension matching the output format.
- The JSON part passes through the original input JSON unchanged.
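The shape of such an item can be sketched as follows. This is a simplified model and not the node's source: the interfaces, the `OUTPUT_TYPES` lookup table, the `toConvertedItem` helper, and the `converted.<ext>` naming scheme are all assumptions made for illustration.

```typescript
import { Buffer } from "node:buffer";

// Simplified item shape; n8n's real interfaces carry more fields.
interface BinaryEntry {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName: string;
}

interface OutputItem {
  json: Record<string, unknown>;
  binary: Record<string, BinaryEntry>;
}

// Illustrative lookup table (an assumption, not the node's actual mapping).
const OUTPUT_TYPES: Record<string, { ext: string; mime: string }> = {
  markdown: { ext: "md", mime: "text/markdown" },
  html: { ext: "html", mime: "text/html" },
  docx: {
    ext: "docx",
    mime: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  },
  pdf: { ext: "pdf", mime: "application/pdf" },
  latex: { ext: "tex", mime: "application/x-latex" },
  plain: { ext: "txt", mime: "text/plain" },
};

function toConvertedItem(
  inputJson: Record<string, unknown>,
  content: Buffer,
  toFormat: string,
  binaryProperty = "data",
): OutputItem {
  const type = OUTPUT_TYPES[toFormat];
  if (type === undefined) {
    throw new Error(`Unsupported output format: ${toFormat}`);
  }
  return {
    json: inputJson, // original input JSON, passed through unchanged
    binary: {
      [binaryProperty]: {
        data: content.toString("base64"),
        mimeType: type.mime,
        fileName: `converted.${type.ext}`, // hypothetical naming scheme
      },
    },
  };
}
```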
Extracted Images (optional second output):
- Contains any images extracted from the source document during conversion.
- Each image is provided as a separate binary item with:
  - Base64 encoded image data.
  - MIME type inferred from the image file extension.
  - A generated unique filename.
- JSON includes metadata about the source document and original image name.
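A minimal sketch of how one extracted image could be turned into such an item is shown below. It uses only Node.js built-ins so that it runs on its own: a small lookup table and `crypto.randomUUID` stand in for the `mime-types` and `uuid` packages the node depends on. The `toImageItem` helper and the JSON field names `sourceDocument` and `originalName` are hypothetical.

```typescript
import { Buffer } from "node:buffer";
import { randomUUID } from "node:crypto";
import { basename, extname } from "node:path";

// Stand-in for a MIME lookup library; covers only a few common image types.
const IMAGE_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".svg": "image/svg+xml",
};

function toImageItem(imagePath: string, content: Buffer, sourceFileName: string) {
  const ext = extname(imagePath).toLowerCase();
  return {
    // Hypothetical metadata field names.
    json: { sourceDocument: sourceFileName, originalName: basename(imagePath) },
    binary: {
      data: {
        data: content.toString("base64"),
        // Fall back to a generic type when the extension is unknown.
        mimeType: IMAGE_TYPES[ext] ?? "application/octet-stream",
        // Unique name that keeps the original extension.
        fileName: `${randomUUID()}${ext}`,
      },
    },
  };
}
```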
Dependencies
- Requires the external pandoc command-line tool installed on the system where n8n runs.
- Uses Node.js modules: node-pandoc, fs/promises, path, os, uuid, and mime-types.
- No special API keys or credentials are needed.
- Temporary files are created in the OS temp directory during processing and cleaned up afterward.
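The create-then-clean-up pattern for temporary files can be sketched like this. It is an illustration of the general technique with `fs/promises` and `os`, not the node's actual code; the `withTempDir` and `stageAndRead` helpers, the `pandoc-` directory prefix, and the `input.md` file name are assumptions.

```typescript
import { mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Run `work` inside a fresh directory under the OS temp dir,
// then remove the directory whether or not `work` succeeded.
async function withTempDir<T>(work: (dir: string) => Promise<T>): Promise<T> {
  const dir = await mkdtemp(join(tmpdir(), "pandoc-"));
  try {
    return await work(dir);
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}

// Usage: stage an input file, read it back, and let the helper clean up.
async function stageAndRead(text: string): Promise<{ path: string; content: string }> {
  return withTempDir(async (dir) => {
    const path = join(dir, "input.md");
    await writeFile(path, text, "utf8");
    return { path, content: await readFile(path, "utf8") };
  });
}
```

Putting the cleanup in `finally` means a failed conversion does not leave files behind in the temp directory.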
Troubleshooting
- No binary data found error: Occurs if the specified binary property does not exist or is empty in the input item. Ensure the correct binary property name is set and that the input contains valid binary data.
- Pandoc execution errors: If Pandoc is not installed, or is not on the PATH of the process running n8n, the node will fail. Verify the Pandoc installation and the PATH configuration.
- File permission issues: The node writes temporary files; lack of write permissions in the temp directory can cause failures.
- Unsupported format errors: Using an unsupported input or output format may cause Pandoc to exit with an error. Confirm that the formats you selected are among those listed under Properties.
- Additional options syntax: Incorrectly formatted additional options string may cause Pandoc to fail. Use proper command-line option syntax separated by spaces.
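When diagnosing Pandoc execution errors, a quick check that the binary is reachable from the n8n process can help. The sketch below is one possible way to do that by running the command with `--version`; the `isCommandAvailable` helper is hypothetical and is not part of the node.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `command --version` can be started and exits with code 0.
function isCommandAvailable(command: string): boolean {
  const result = spawnSync(command, ["--version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

if (!isCommandAvailable("pandoc")) {
  console.warn("pandoc was not found; install it or adjust PATH for the n8n process.");
}
```

Run this in the same environment as n8n (same user, container, or service), since PATH can differ between an interactive shell and a service.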