PDF-LIB

Perform operations on PDF files (get info, split)

Overview

This node performs two operations on PDF files: it extracts information about a PDF, such as its page count, or splits a PDF into smaller chunks of a fixed number of pages. Use it to inspect documents before further processing, or to divide large PDFs into parts that are easier to process or distribute.

Practical examples:

  • Extracting the total page count from uploaded PDF invoices before further processing.
  • Splitting a large PDF report into multiple smaller PDFs, each containing a fixed number of pages, to send them separately or upload in batches.

Properties

Name               Meaning
Operation          Choose between "Get PDF Info" (extract info) or "Split PDF" (split into page chunks).
Binary Property    Name of the binary property that contains the PDF file data. Default is "data".
Chunk Size         Number of pages per chunk when splitting a PDF. Only shown if operation is "Split PDF".

Output

  • For Get PDF Info operation:
    • json output contains:
      • pageCount: Number of pages in the PDF.
      • operation: The string "getInfo".
      • fileName: Original file name of the PDF or "unknown.pdf" if not available.
  • For Split PDF operation:
    • json output contains:
      • count: Number of PDF chunks created.
      • pageRanges: Array of strings representing page ranges for each chunk (e.g., "1-3").
      • operation: The string "split".
      • originalFileName: Original file name or "unknown.pdf".
    • binary output contains multiple binary properties named pdf1, pdf2, etc., each holding a chunk of the split PDF with:
      • data: Base64 encoded PDF chunk.
      • fileName: Filename like split_1.pdf, split_2.pdf, etc.
      • mimeType: Always "application/pdf".
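
The naming scheme of the Split PDF output can be illustrated with a small sketch. This is not the node's source code: planChunks and ChunkPlan are hypothetical names introduced here, and writing a single-page final chunk as "7-7" is an assumption, since the documentation only shows the "1-3" form.

```typescript
// Hypothetical helper, not part of the node: plans the chunks for a split.
interface ChunkPlan {
  binaryKey: string; // binary property name, e.g. "pdf1"
  fileName: string; // e.g. "split_1.pdf"
  pageRange: string; // 1-based, inclusive, e.g. "1-3"
  mimeType: "application/pdf";
}

function planChunks(pageCount: number, chunkSize: number): ChunkPlan[] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const plans: ChunkPlan[] = [];
  for (let start = 1; start <= pageCount; start += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    const end = Math.min(start + chunkSize - 1, pageCount);
    const index = plans.length + 1;
    plans.push({
      binaryKey: `pdf${index}`,
      fileName: `split_${index}.pdf`,
      // Assumption: a one-page chunk is written as "7-7".
      pageRange: `${start}-${end}`,
      mimeType: "application/pdf",
    });
  }
  return plans;
}

// A 7-page PDF split with a chunk size of 3.
const plans = planChunks(7, 3);
const splitJson = {
  count: plans.length,
  pageRanges: plans.map((p) => p.pageRange),
  operation: "split",
  originalFileName: "unknown.pdf",
};
console.log(splitJson.pageRanges); // [ '1-3', '4-6', '7-7' ]
```

In this sketch, count always equals the number of pdfN binary properties, so a downstream node could loop from pdf1 to pdf{count} to pick up every chunk.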

Dependencies

  • Uses the pdf-lib library to load, read, and manipulate PDF files.
  • Requires input items to have binary data containing the PDF file.
  • No external API keys or services are needed; all processing is done locally within the node.

Troubleshooting

  • Error: No binary data property 'X' found on item
    This means the specified binary property does not exist on the input item. Ensure the binary property name matches exactly the property containing the PDF file.
  • Invalid PDF or corrupted file errors
    If the PDF cannot be loaded, verify the input binary data is a valid PDF file.
  • Chunk size issues when splitting
    A chunk size greater than or equal to the number of pages produces a single chunk containing the whole PDF. Choose a chunk size smaller than the page count if you need multiple output files.
  • If the node fails but "Continue On Fail" is enabled, it outputs an error message in the JSON field instead of stopping execution.
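
The missing-property error and the "Continue On Fail" behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the node's implementation: the Item and BinaryEntry types are minimal stand-ins for n8n's item shape, requireBinary and processItem are hypothetical names, and the JSON field name error is assumed.

```typescript
// Minimal stand-ins for the item shape; assumptions, not n8n's real types.
interface BinaryEntry {
  data: string; // Base64 encoded file content
  fileName?: string;
  mimeType: string;
}
interface Item {
  json: Record<string, unknown>;
  binary?: Record<string, BinaryEntry>;
}

// Hypothetical helper: looks up the binary property or throws the
// error message quoted in the troubleshooting list.
function requireBinary(item: Item, propertyName: string): BinaryEntry {
  const entry = item.binary?.[propertyName];
  if (entry === undefined) {
    throw new Error(`No binary data property '${propertyName}' found on item`);
  }
  return entry;
}

// Hypothetical wrapper showing the two failure modes.
function processItem(item: Item, propertyName: string, continueOnFail: boolean): Item {
  try {
    const entry = requireBinary(item, propertyName);
    return { json: { fileName: entry.fileName ?? "unknown.pdf" } };
  } catch (err) {
    if (!continueOnFail) throw err; // stop execution
    const message = err instanceof Error ? err.message : String(err);
    return { json: { error: message } }; // assumed field name
  }
}

// The PDF is stored under "data", but the node is configured with "attachment".
const inputItem: Item = {
  json: {},
  binary: { data: { data: "", mimeType: "application/pdf" } },
};
console.log(processItem(inputItem, "attachment", true).json);
```

With continueOnFail set to false, the same call throws instead of returning an item, which corresponds to the workflow stopping at this node.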
