PDF Tools

Manipulate PDF files with various operations

Overview

The "Extract Pages" operation of the PDF Tools node extracts specific pages from a PDF file into a new PDF that contains only those pages. Use it to isolate parts of a large PDF document without modifying the original file.

Common scenarios include:

  • Extracting chapters or sections from a book or report.
  • Creating a summary PDF with only relevant pages.
  • Splitting a multi-page invoice or contract into separate documents.

For example, if you have a 10-page PDF and want to extract pages 1, 3 to 5, and 7, you can specify "1,3-5,7" as the pages to extract. The node will output a new PDF containing just those pages.
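
The page string format described above can be illustrated with a small parser. This is a minimal sketch, not the node's actual implementation: the function name parsePageSpec, its tolerance for whitespace, and its choice to keep pages in the order written (without removing duplicates) are assumptions made for illustration.

```typescript
// Hypothetical helper: expands a page string such as "1,3-5,7" into
// 1-based page numbers. Not taken from the node's source code.
function parsePageSpec(spec: string, totalPages: number): number[] {
  const trimmed = spec.trim().toLowerCase();

  // "all" selects every page in the document.
  if (trimmed === "all") {
    const everyPage: number[] = [];
    for (let page = 1; page <= totalPages; page++) {
      everyPage.push(page);
    }
    return everyPage;
  }

  const pages: number[] = [];
  for (const part of trimmed.split(",")) {
    // Each part is either a single page ("7") or an inclusive range ("3-5").
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (match === null) {
      throw new Error(`Invalid page specification: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end > totalPages || start > end) {
      throw new Error(
        `Invalid page numbers. Pages must be between 1 and ${totalPages}`
      );
    }
    for (let page = start; page <= end; page++) {
      pages.push(page);
    }
  }
  return pages;
}

// The example from the text: a 10-page PDF and the string "1,3-5,7".
const examplePages = parsePageSpec("1,3-5,7", 10);
console.log(examplePages); // logs the pages 1, 3, 4, 5 and 7
```

Rejecting a reversed range such as "5-3" is a design choice of this sketch; the node itself may treat such input differently.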

Properties

Name | Meaning
PDF Binary Field | Name of the binary field containing the input PDF file to process.
Pages | Pages to extract from the PDF. Can be a single page (e.g., "1"), multiple pages/ranges (e.g., "1,3-5"), or "all" for all pages.

Output

The node outputs a new PDF file in binary form under the binary property named "output" by default. The binary data contains the extracted pages as a standalone PDF document.

The output structure per item is:

{
  "json": {},
  "binary": {
    "output": {
      "data": "<base64-encoded PDF data>",
      "fileName": "output.pdf",
      "mimeType": "application/pdf"
    }
  }
}

No additional JSON fields are added for this operation.
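
A downstream step that consumes this item may want to confirm the shape before reading the binary data. The following is a rough sketch under stated assumptions: the interface names and the isPdfOutputItem guard are hypothetical and are not n8n's own types, and the base64 string is a placeholder rather than a complete PDF.

```typescript
// Shape of one output item as described above. The interface names are
// illustrative and are not n8n's own type names.
interface PdfBinaryEntry {
  data: string; // base64-encoded PDF data
  fileName: string;
  mimeType: string;
}

interface PdfOutputItem {
  json: Record<string, unknown>;
  binary: { output: PdfBinaryEntry };
}

// Hypothetical type guard: checks that a value looks like an output item
// of the "Extract Pages" operation.
function isPdfOutputItem(item: unknown): item is PdfOutputItem {
  if (typeof item !== "object" || item === null) return false;
  const json = (item as { json?: unknown }).json;
  if (typeof json !== "object" || json === null) return false;
  const binary = (item as { binary?: unknown }).binary;
  if (typeof binary !== "object" || binary === null) return false;
  const output = (binary as { output?: unknown }).output;
  if (typeof output !== "object" || output === null) return false;
  const entry = output as Partial<PdfBinaryEntry>;
  return (
    typeof entry.data === "string" &&
    typeof entry.fileName === "string" &&
    entry.mimeType === "application/pdf"
  );
}

const exampleItem: unknown = {
  json: {},
  binary: {
    output: {
      data: "JVBERi0xLjc=", // base64 of "%PDF-1.7", a placeholder only
      fileName: "output.pdf",
      mimeType: "application/pdf",
    },
  },
};

console.log(isPdfOutputItem(exampleItem)); // true
```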

Dependencies

  • Requires the input PDF as binary data with MIME type application/pdf, available in the specified binary field of the input item.
  • Uses the pdf-lib library internally to manipulate PDF files.
  • No external API keys or services are required.

Troubleshooting

  • Error: No binary data found for [field name]
    This means the specified binary field does not exist on the input item or does not contain valid data. Ensure the correct binary field name is set and that the input contains the PDF file.

  • Error: The file must be in PDF format (MIME type: application/pdf). Received: [type]
    The input file is not recognized as a PDF. Verify the input binary data is a valid PDF file with the correct MIME type.

  • Error: Invalid page numbers. Pages must be between 1 and [total pages]
    The pages parameter includes page numbers outside the range of the PDF. Check the page numbers and ranges specified.

  • General operation errors
    Make sure the pages string is correctly formatted (e.g., "1", "1,3-5", or "all") and that the PDF binary data is intact.
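
The conditions behind the errors above can be checked before the node runs, for example in a preceding code step. The sketch below is illustrative only: the InputItem type, the checkExtractInput function, and the order of the checks are assumptions, and only the message texts are taken from the list above.

```typescript
// Minimal stand-in for an input item; not n8n's actual item type.
interface BinaryEntry {
  data: string;
  mimeType: string;
}

interface InputItem {
  binary?: Record<string, BinaryEntry | undefined>;
}

// Hypothetical pre-flight check that raises the messages listed above.
function checkExtractInput(
  item: InputItem,
  fieldName: string,
  requestedPages: number[],
  totalPages: number
): void {
  // Missing field or empty data: "No binary data found".
  const entry = item.binary?.[fieldName];
  if (entry === undefined || entry.data === "") {
    throw new Error(`No binary data found for ${fieldName}`);
  }

  // Anything other than application/pdf is rejected.
  if (entry.mimeType !== "application/pdf") {
    throw new Error(
      `The file must be in PDF format (MIME type: application/pdf). Received: ${entry.mimeType}`
    );
  }

  // Every requested page must be a whole number from 1 to totalPages.
  const outOfRange = requestedPages.some(
    (page) => Math.floor(page) !== page || page < 1 || page > totalPages
  );
  if (outOfRange) {
    throw new Error(
      `Invalid page numbers. Pages must be between 1 and ${totalPages}`
    );
  }
}

// Usage: an image supplied where a PDF is expected.
const sampleItem: InputItem = {
  binary: { data: { data: "iVBORw0KGgo=", mimeType: "image/png" } },
};
try {
  checkExtractInput(sampleItem, "data", [1, 2], 10);
} catch (err) {
  console.log((err as Error).message);
}
```

In this sketch the total page count is passed in by the caller; in practice it only becomes known once the PDF has been loaded.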
