Package Information
Latest Version: 0.2.2
Documentation
n8n-nodes-html-to-docx-via-pandoc
Convert an HTML string to a DOCX binary using a locally installed pandoc on the worker.
Installation
- Ensure pandoc is installed and available on PATH on the n8n worker host.
  - macOS: brew install pandoc
  - Ubuntu/Debian: apt-get install pandoc
  - Windows: install from https://pandoc.org/installing.html and ensure pandoc.exe is on PATH
- Install this community node package into your n8n instance according to n8n docs.
Node options
- HTML Source: Direct string or JSON field path
- Output Binary Property: name of the binary property (default data)
- File Name: output file name (default document.docx)
- Pandoc Path: path to pandoc executable (default pandoc)
- Embed Resources: include local resources with --embed-resources --standalone when supported
- Resource Paths: directories for pandoc to search for resources
- Reference DOCX Source: None | Filesystem Path | Built-in Minimal Reference
- Reference DOCX Path: path to a reference DOCX template (when Filesystem Path)
- Clean Output Mode: enable cleanup (keeps only bold/italic, preserves lists/heading styles, removes bookmarks)
- Punctuation Normalization: Off | Conservative (default Conservative)
- Sanitize via CommonMark: round-trip the content through CommonMark to simplify structure
- Strip Formatting Except Bold/Italic, Remove Bookmarks, Collapse Empty Runs/Paragraphs, Ensure xml:space="preserve"
- Whitespace Policy: Collapse | Preserve Breaks (matching/sanitization only)
- Normalization Profile (JSON): advanced profile to share across nodes
- Timeout: seconds to wait for pandoc
- Additional Pandoc Arguments: advanced array of extra args (tokens only)
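To make the relationship between these options and the pandoc command line more concrete, here is a minimal sketch of how such options could be turned into an argument list. This is not the node's actual implementation: the `ConvertOptions` interface, its field names, and `buildPandocArgs` are hypothetical names chosen for illustration. Only the pandoc flags themselves (`--from`, `--to`, `--output`, `--embed-resources`, `--standalone`, `--resource-path`, `--reference-doc`) are real pandoc options.

```typescript
// Hypothetical option shape. The fields loosely mirror the options listed
// above but are NOT taken from the node's source code.
interface ConvertOptions {
  outputPath: string;
  embedResources?: boolean;
  resourcePaths?: string[];
  referenceDocxPath?: string;
  additionalArgs?: string[];
}

// Build the argument vector as separate tokens so it can be handed to a
// process-spawning API without passing through a shell.
// pathSeparator is ":" on Linux/macOS and ";" on Windows for --resource-path.
function buildPandocArgs(opts: ConvertOptions, pathSeparator: string = ":"): string[] {
  const args: string[] = ["--from=html", "--to=docx", `--output=${opts.outputPath}`];
  if (opts.embedResources) {
    args.push("--embed-resources", "--standalone");
  }
  if (opts.resourcePaths && opts.resourcePaths.length > 0) {
    args.push(`--resource-path=${opts.resourcePaths.join(pathSeparator)}`);
  }
  if (opts.referenceDocxPath) {
    args.push(`--reference-doc=${opts.referenceDocxPath}`);
  }
  if (opts.additionalArgs) {
    // Extra arguments are appended as individual tokens, one per array entry.
    args.push(...opts.additionalArgs);
  }
  return args;
}

const exampleArgs: string[] = buildPandocArgs({
  outputPath: "document.docx",
  embedResources: true,
  resourcePaths: ["./assets", "./images"],
  additionalArgs: ["--toc"],
});
```

Keeping every argument as its own token is presumably what "tokens only" refers to: nothing is concatenated into a shell string, so values containing spaces do not need quoting.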
NormalizationProfile
A JSON-serializable structure shared with the DOCX diff node.
Default:
{
  "punctuation": "conservative",
  "whitespacePolicy": "collapse",
  "collapseNBSP": true,
  "normalizeQuotes": true,
  "unicodeNormalization": "NFC"
}
Precedence:
- If Normalization Profile (JSON) is provided, its fields override defaults.
- Aggressive punctuation normalization is used for matching only and never mutates output.
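The override rule can be pictured as a shallow merge of the supplied JSON over the documented defaults. The sketch below assumes exactly that; `resolveProfile` and `DEFAULT_PROFILE` are hypothetical names, and the field types are kept loose (`string`) because the README does not list every allowed value.

```typescript
// Field names and default values come from the default profile shown above.
interface NormalizationProfile {
  punctuation: string;
  whitespacePolicy: string;
  collapseNBSP: boolean;
  normalizeQuotes: boolean;
  unicodeNormalization: string;
}

const DEFAULT_PROFILE: NormalizationProfile = {
  punctuation: "conservative",
  whitespacePolicy: "collapse",
  collapseNBSP: true,
  normalizeQuotes: true,
  unicodeNormalization: "NFC",
};

// Assumption: fields present in the supplied JSON win, and missing fields
// fall back to the defaults. An empty or absent string yields the defaults.
function resolveProfile(profileJson?: string): NormalizationProfile {
  if (!profileJson || profileJson.trim() === "") {
    return { ...DEFAULT_PROFILE };
  }
  const overrides = JSON.parse(profileJson) as Partial<NormalizationProfile>;
  return { ...DEFAULT_PROFILE, ...overrides };
}

// Only collapseNBSP is overridden; every other field keeps its default.
const resolvedProfile: NormalizationProfile = resolveProfile('{"collapseNBSP": false}');
```

Because the merged result is plain JSON-serializable data, the same string can be pasted into several nodes to keep their normalization behavior aligned.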
Minimal DOCX constraints
When cleanup is enabled, output is sanitized to:
- Runs: retain only w:b and w:i
- Paragraph props: retain only w:pStyle and w:numPr
- Remove bookmarks
- Collapse empty runs/paragraphs (unless paragraph has pStyle/numPr)
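The rules above can be illustrated on a deliberately simplified in-memory model. Real DOCX content is WordprocessingML XML, and the node's own cleanup code is not shown in this README, so the `Run` and `Paragraph` shapes and the `sanitizeParagraphs` function below are hypothetical. Bookmark removal is left out because the toy model has no bookmark elements.

```typescript
// Simplified, hypothetical model: each element only lists the names of the
// property elements it carries (for example "w:b" or "w:pStyle").
interface Run {
  props: string[];
  text: string;
}

interface Paragraph {
  props: string[];
  runs: Run[];
}

const KEPT_RUN_PROPS: string[] = ["w:b", "w:i"];
const KEPT_PARAGRAPH_PROPS: string[] = ["w:pStyle", "w:numPr"];

function sanitizeParagraphs(paragraphs: Paragraph[]): Paragraph[] {
  const result: Paragraph[] = [];
  for (const paragraph of paragraphs) {
    // Paragraph properties: retain only style and numbering references.
    const props = paragraph.props.filter((name) => KEPT_PARAGRAPH_PROPS.indexOf(name) !== -1);
    const runs = paragraph.runs
      // Run properties: retain only bold and italic.
      .map((run) => ({
        text: run.text,
        props: run.props.filter((name) => KEPT_RUN_PROPS.indexOf(name) !== -1),
      }))
      // Collapse runs that carry no text.
      .filter((run) => run.text.length > 0);
    // Drop empty paragraphs unless they still carry pStyle or numPr.
    if (runs.length === 0 && props.length === 0) {
      continue;
    }
    result.push({ props, runs });
  }
  return result;
}

const cleaned: Paragraph[] = sanitizeParagraphs([
  { props: ["w:pStyle", "w:jc"], runs: [{ props: ["w:b", "w:color"], text: "Title" }] },
  { props: ["w:jc"], runs: [{ props: [], text: "" }] }, // empty and unstyled: removed
  { props: ["w:numPr"], runs: [] },                     // empty but numbered: kept
]);
```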
Notes
- Requires pandoc 2.11+ for --embed-resources. The node will continue without this flag on older versions.
- For remote resources referenced by HTML, behavior may vary by pandoc setup. Prefer embedding or ensuring resources are available locally.
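A version gate of the kind described in the first note could look roughly like the sketch below. It parses the first line of `pandoc --version` output and compares it with a minimum version. The function names are hypothetical, the default threshold of 2.11 is simply taken from the note above, and how the node itself detects the version is not documented here.

```typescript
// Extract the numeric version from `pandoc --version` output, whose first
// line looks like "pandoc 3.1.9" (or "pandoc.exe 2.19.2" on some Windows builds).
function parsePandocVersion(versionOutput: string): number[] | null {
  const match = /^pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)/m.exec(versionOutput);
  if (!match) {
    return null;
  }
  return match[1].split(".").map((part) => parseInt(part, 10));
}

// Compare two dotted versions component by component; missing parts count as 0.
function isAtLeast(version: number[], minimum: number[]): boolean {
  const length = Math.max(version.length, minimum.length);
  for (let i = 0; i < length; i++) {
    const a = version[i] ?? 0;
    const b = minimum[i] ?? 0;
    if (a !== b) {
      return a > b;
    }
  }
  return true;
}

// The default minimum comes from the README note; pass a different value if
// your pandoc documentation states another threshold.
function supportsEmbedResources(versionOutput: string, minimum: number[] = [2, 11]): boolean {
  const version = parsePandocVersion(versionOutput);
  return version !== null && isAtLeast(version, minimum);
}

const canEmbed: boolean = supportsEmbedResources("pandoc 3.1.9\nFeatures: +server +lua");
```

Comparing numeric components rather than strings matters here: a plain string comparison would rank "2.9" above "2.11".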
Development
- Node >= 20.15
- Install deps: npm ci
- Build: npm run build
- Dev compile: npm run dev
- Lint: npm run lint
- Tests:
  - Unit: npm run test:unit
  - Integration: npm run test:integration (requires pandoc installed)
License
MIT