PDF Tools

Manipulate PDF files with various operations

Overview

The PDF Tools node provides operations for manipulating PDF files within an n8n workflow: adding images or watermarks, deleting or extracting pages, merging multiple PDFs, reading metadata, reordering or rotating pages, splitting PDFs, and extracting text content.

The node is useful wherever PDF processing needs to be automated, for example:

  • Reordering pages in a PDF document before sending it out.
  • Extracting specific pages from large reports.
  • Adding company logos or watermarks to official documents.
  • Merging multiple PDFs into a single file for easier distribution.
  • Extracting text for indexing or searching purposes.

Practical example: You receive scanned PDFs and want to reorder the pages to correct their sequence automatically before archiving them.

Properties

  • PDF Binary Field: Name of the binary field containing the input PDF file.
  • New Page Order: New order of pages, given as comma-separated page numbers (e.g., "3,1,2").

Note: The above properties are specific to the Reorder Pages operation. The node supports many other operations with additional properties not listed here.
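
As a rough illustration of how a New Page Order string such as "3,1,2" could be interpreted, the sketch below parses it into zero-based indices and applies it to a list. The helper names parsePageOrder and reorder are hypothetical and not taken from the node's source. The sketch assumes page numbers are 1-based, and it does not check whether every page appears exactly once, since the node's behavior in that case is not documented here.

```typescript
// Hypothetical helper, not taken from the node's source.
// Parses a page order string such as "3,1,2" into zero-based page indices.
// Assumption: page numbers in the string are 1-based.
function parsePageOrder(order: string, pageCount: number): number[] {
  const trimmed = order.trim();
  if (trimmed === "") {
    throw new Error("New page order is empty");
  }
  return trimmed.split(",").map((part) => {
    const token = part.trim();
    // Accept only plain non-negative integers (no signs, decimals, or blanks).
    if (!/^\d+$/.test(token)) {
      throw new Error(`Invalid page number: "${token}"`);
    }
    const pageNumber = Number(token);
    if (pageNumber < 1 || pageNumber > pageCount) {
      throw new Error(`Page ${pageNumber} is out of range (1-${pageCount})`);
    }
    return pageNumber - 1; // convert to a zero-based index
  });
}

// Applies the order to any array, standing in for a PDF's list of pages.
function reorder<T>(pages: T[], order: string): T[] {
  return parsePageOrder(order, pages.length).map((index) => pages[index]);
}
```

For a three-element list, reorder(["a", "b", "c"], "3,1,2") gives ["c", "a", "b"], while an order such as "1,4" throws because page 4 does not exist.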

Output

The output contains the processed PDF file in binary form under a field named output. This binary data includes:

  • data: Base64 encoded string of the resulting PDF file after the operation.
  • fileName: The filename assigned to the output PDF, typically "output.pdf".
  • mimeType: Always "application/pdf".

No textual JSON output is produced for this operation; the main result is the reordered PDF binary.
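
For code that consumes this output downstream, the three fields above can be modeled roughly as follows. The interface name PdfBinaryOutput and the guard isPdfBinaryOutput are hypothetical; they mirror only the fields listed in this section, and the real binary object may carry additional properties.

```typescript
// Assumed shape, based only on the three fields described above.
interface PdfBinaryOutput {
  data: string; // Base64-encoded PDF bytes
  fileName: string; // typically "output.pdf"
  mimeType: string; // "application/pdf"
}

// Hypothetical type guard for code that consumes the node's output.
function isPdfBinaryOutput(value: unknown): value is PdfBinaryOutput {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.data === "string" &&
    candidate.data.length > 0 &&
    typeof candidate.fileName === "string" &&
    candidate.mimeType === "application/pdf"
  );
}
```

The guard only checks the field types and the MIME type; it does not decode the Base64 string or confirm that the bytes form a valid PDF.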

Dependencies

  • Uses the pdf-lib library for PDF manipulation.
  • Uses pdf-parse for text extraction operations.
  • Requires input PDFs to be provided as binary data fields in the workflow.
  • No external API keys or services are needed; all processing is done locally within the node.

Troubleshooting

  • Missing binary data error: If the specified PDF binary field does not exist or is empty in the input item, the node will throw an error. Ensure the configured binary field name exactly matches the name of the field containing the PDF.
  • Invalid MIME type error: The node expects the input file to have MIME type application/pdf. Providing other file types will cause an error.
  • Invalid page order format: The new page order must be a comma-separated list of valid page numbers within the PDF's page count. Invalid or out-of-range page numbers will trigger errors.
  • Empty new page order: The property for new page order is mandatory for the reorder operation. Omitting it will cause the node to fail.

To resolve these issues, verify that:

  • The input binary field names are correctly set.
  • The input files are valid PDFs.
  • The new page order string is properly formatted and references existing pages.
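
The first two checks in this list could be run before the node, for example in a preceding code step. The following is a minimal sketch under stated assumptions: WorkflowItem and BinaryEntry are simplified stand-ins for an n8n item and its binary entries, and checkPdfInput is a hypothetical helper, not part of the node. It confirms the MIME type label only and does not inspect the file contents.

```typescript
// Simplified stand-ins for an n8n item; the real types carry more fields.
interface BinaryEntry {
  data: string;
  mimeType: string;
  fileName?: string;
}

interface WorkflowItem {
  binary?: Record<string, BinaryEntry>;
}

// Hypothetical pre-flight check: returns a list of problems, empty if none.
function checkPdfInput(item: WorkflowItem, fieldName: string): string[] {
  const entry = item.binary?.[fieldName];
  // Covers both a missing field and a field with no data.
  if (entry === undefined || entry.data === "") {
    return [`No binary data found in field "${fieldName}"`];
  }
  const problems: string[] = [];
  if (entry.mimeType !== "application/pdf") {
    problems.push(`Expected application/pdf but got "${entry.mimeType}"`);
  }
  return problems;
}
```

An empty result means the item has a non-empty binary field with the expected MIME type; any returned message points at the setting to correct.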
