PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4ME API

Overview

This node extracts specific pages from a PDF document. The source PDF can be supplied in three ways: as binary data from a previous node, as a base64 encoded string, or as a URL pointing to the PDF file. You specify which pages to extract using single pages, ranges, or a combination of both. The extracted pages are returned as a new PDF document.
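For the base64 input method, the PDF bytes have to be encoded before they reach the node. The following is a minimal sketch of that encoding step using Python's standard library. The helper name `pdf_bytes_to_base64` and the sample bytes are illustrative assumptions, not part of the node.

```python
import base64


def pdf_bytes_to_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as an ASCII base64 string (hypothetical helper)."""
    return base64.b64encode(pdf_bytes).decode("ascii")


# Stand-in for real file contents; an actual PDF begins with the "%PDF-" header.
sample = b"%PDF-1.4\n"
encoded = pdf_bytes_to_base64(sample)
```

In practice the bytes would come from reading a file in binary mode, and the resulting string would be the value given to the Base64 PDF Content property.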

This node is beneficial in scenarios where you need to isolate certain pages from large PDF files for further processing, sharing, or archiving. For example:

  • Extracting invoice pages from a multi-page financial report.
  • Isolating specific chapters or sections from an eBook PDF.
  • Splitting scanned documents into individual pages for separate workflows.

Properties

Name: Meaning
Input Data Type: Choose how to provide the PDF file to extract pages from. Options:
  • Binary Data: Use the PDF file from a previous node.
  • Base64 String: Provide the PDF content as a base64 encoded string.
  • URL: Provide a URL to the PDF file.
Input Binary Field: Name of the binary property that contains the PDF file (usually "data" for file uploads). Only shown if Input Data Type is Binary Data.
Base64 PDF Content: Base64 encoded PDF document content. Only shown if Input Data Type is Base64 String.
PDF URL: URL of the PDF file to extract pages from. Only shown if Input Data Type is URL.
Document Name: Name of the output PDF document containing the extracted pages. Defaults to "output.pdf".
Page Numbers: Page numbers to extract. Supports single pages (e.g., "1"), comma-separated lists (e.g., "1,3,5"), ranges (e.g., "2-4"), or combinations (e.g., "1, 2, 3-7"). Page numbering starts at 1.
Advanced Options: Custom JSON profiles to adjust extra options for the API call. Useful for setting specific parameters supported by the underlying PDF processing service.
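To make the Page Numbers format concrete, the sketch below expands a specification string into an explicit list of 1-based page numbers. It illustrates how the documented format can be read; it is not the node's or the API's actual parser, and the function name `parse_page_numbers` is a hypothetical one chosen for this example.

```python
def parse_page_numbers(spec: str) -> list[int]:
    """Expand a spec such as "1, 2, 3-7" into a list of 1-based page numbers.

    Illustrative only: the real service may accept or reject inputs differently.
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"Empty entry in page specification: {spec!r}")
        if "-" in part:
            # A range such as "3-7" includes both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"Range start exceeds end: {part!r}")
            candidates = list(range(start, end + 1))
        else:
            candidates = [int(part)]
        for page in candidates:
            if page < 1:
                raise ValueError(f"Page numbers start at 1, got {page}")
            pages.append(page)
    return pages


expanded = parse_page_numbers("1, 2, 3-7")
```

Running a check like this before calling the node catches malformed input such as "0" or "7-3" early, without spending an API call.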

Output

The node outputs a JSON object representing the extracted PDF document. The main output field, "json", contains metadata and references to the newly created PDF with the selected pages. If the original input was binary, the output will typically also include the extracted PDF as binary data under a binary property.

Binary data output represents the actual PDF file containing only the extracted pages, ready for download, further processing, or storage.

Dependencies

  • Requires access to the PDF4me API, the external service that performs the page extraction.
  • Needs proper API authentication configured in n8n (such as an API key credential).
  • Internet access may be required if providing the PDF via URL.

Troubleshooting

  • Common Issues:

    • Incorrect page number format can cause errors or unexpected results. Ensure page numbers are valid and within the range of the source PDF.
    • Providing an invalid or inaccessible URL will result in failure to fetch the PDF.
    • Missing or incorrect binary property name when using binary input will cause the node to fail to locate the PDF file.
    • Large PDF files might lead to timeouts depending on API limits.
  • Error Messages:

    • "Invalid page numbers" — Check the format and validity of the page numbers input.
    • "Failed to fetch PDF from URL" — Verify the URL is correct and accessible.
    • "Binary data not found" — Confirm the binary property name matches the incoming data.
    • API authentication errors — Ensure the API key or credentials are correctly set up in n8n.
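One way to avoid the "Invalid page numbers" case is to confirm, before the call, that every requested page exists in the source document. The sketch below assumes the page count is already known from some other step; both `out_of_range_pages` and `page_count` are hypothetical names used only for illustration.

```python
def out_of_range_pages(pages: list[int], page_count: int) -> list[int]:
    """Return the requested pages that fall outside 1..page_count."""
    return [page for page in pages if page < 1 or page > page_count]


# Example: a 10-page document and a request that includes page 12.
invalid = out_of_range_pages([1, 3, 12], page_count=10)
```

An empty result means every requested page is within the document; anything else lists the pages to correct.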
