PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API

Overview

This node extracts specific pages from a PDF document. The source PDF can be supplied in three ways: as binary data from a previous node, as a base64-encoded string, or as a URL pointing to the PDF file. You specify which pages to extract as page numbers or ranges. The output is a new PDF document containing only the extracted pages.

This node is useful when you need to isolate certain pages from a large PDF file for further processing, sharing, or archiving. For example:

  • Extracting invoice pages from a multi-page PDF report.
  • Splitting a contract into separate sections.
  • Creating a summary document with selected pages from a larger manual.

Properties

| Name | Meaning |
| --- | --- |
| Input Data Type | Choose how to provide the PDF file to extract pages from. Options: Binary Data (from previous node); Base64 String (provide PDF content as a base64-encoded string); URL (provide a URL to the PDF file). |
| Input Binary Field | Name of the binary property that contains the PDF file when using the Binary Data input type. Usually "data" for file uploads. |
| Base64 PDF Content | Base64-encoded PDF document content. Used when Input Data Type is set to Base64 String. |
| PDF URL | URL to the PDF file to extract pages from. Used when Input Data Type is set to URL. |
| Document Name | Name of the output PDF document after extraction. Defaults to "output.pdf". |
| Page Numbers | Page numbers to extract from the PDF. Supports single pages (e.g., "1"), multiple pages separated by commas (e.g., "1,3,5"), and ranges (e.g., "2-4"). |
| Advanced Options | Collection of additional options. Currently supports Custom Profiles: a JSON string to adjust custom properties for API calls, allowing advanced configuration based on external API documentation. |
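
The Page Numbers formats described above can be expanded into an explicit page list before the value is passed to the node, which makes format mistakes visible early. The TypeScript sketch below is only an illustration: parsePageSpec and its optional pageCount parameter are hypothetical names, not part of the node or the PDF4me API. It assumes pages are numbered from 1 and that single pages and ranges may be mixed in one comma-separated list.

```typescript
// Hypothetical helper, not part of the PDF4me node or API.
// Expands a page specification such as "1,3,5" or "2-4" into a sorted,
// de-duplicated list of page numbers (assumed to be 1-based).
function parsePageSpec(spec: string, pageCount?: number): number[] {
  const pages = new Set<number>();

  for (const rawPart of spec.split(",")) {
    const part = rawPart.trim();

    // Accept either a single number ("3") or a range ("2-4").
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(part);
    if (match === null) {
      throw new Error(`Invalid page specification: "${part}"`);
    }

    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;

    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${part}"`);
    }
    // Only checked when the caller knows the length of the source PDF.
    if (pageCount !== undefined && end > pageCount) {
      throw new Error(`Page ${end} is out of range (document has ${pageCount} pages)`);
    }

    for (let page = start; page <= end; page++) {
      pages.add(page);
    }
  }

  return Array.from(pages).sort((a, b) => a - b);
}
```

Running a check like this in a Code node before the extraction step turns a malformed value into a clear local error instead of a failed API call.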

Output

Each output item has two parts. The json field includes metadata and references to the resulting PDF file. The actual PDF content is provided as binary data attached to the output under the specified binary property name.

If the node processes multiple items, it returns an array of such outputs corresponding to each input item.
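
As a rough picture of the output described above, the TypeScript sketch below builds one item with a json part and a binary part. The interfaces are local stand-ins that loosely follow n8n's item layout, and the metadata field names (fileName, mimeType), the buildOutputItem helper, and the "data" property default are assumptions for illustration; the fields the node actually returns may differ.

```typescript
// Local stand-in types, assumed for illustration only.
interface BinaryEntry {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName: string;
}

interface OutputItem {
  json: Record<string, unknown>; // metadata about the result
  binary: Record<string, BinaryEntry>; // file content keyed by binary property name
}

// Hypothetical helper that assembles one output item for an extracted PDF.
function buildOutputItem(
  pdfBase64: string,
  fileName: string = "output.pdf",
  binaryProperty: string = "data",
): OutputItem {
  const mimeType = "application/pdf";
  return {
    json: { fileName, mimeType },
    binary: {
      [binaryProperty]: { data: pdfBase64, mimeType, fileName },
    },
  };
}

// Multiple input items map to an array with one output item each.
const items: OutputItem[] = ["JVBERi0=", "JVBERi0="].map((content, index) =>
  buildOutputItem(content, `pages-${index + 1}.pdf`),
);
```

A downstream node would read the PDF from the binary part and use the json part only for metadata such as the file name.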

Dependencies

  • Requires access to the PDF4me API, the external service that performs the page extraction.
  • Needs valid PDF4me credentials (such as an API key) configured in n8n to authenticate with that service.
  • Internet access is required if the input PDF is provided via URL or if the API service is cloud-based.

Troubleshooting

  • Common Issues:

    • Incorrect page number format can cause errors. Ensure page numbers are valid and within the range of the source PDF.
    • Providing an invalid or inaccessible URL will result in failure to fetch the PDF.
    • Missing or incorrect binary property name when using binary input may lead to no data being found.
    • Invalid base64 string input will cause decoding errors.
  • Error Messages:

    • "Page numbers out of range": Check that the requested pages exist in the source PDF.
    • "Failed to fetch PDF from URL": Verify the URL is correct and accessible.
    • "No binary data found in property": Confirm the binary property name matches the input data.
    • "Invalid base64 content": Validate the base64 string format.

Resolving these typically involves verifying input parameters and ensuring the source PDF is correctly provided.
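
Several of the issues above can be caught before the node runs. The TypeScript sketch below is a minimal pre-flight check under stated assumptions: the PdfInput type and checkPdfInput function are hypothetical, the checks are purely local (the URL check does not fetch anything), and apart from the two messages quoted from the list above the wording is illustrative. It does not reproduce the node's own validation.

```typescript
// Hypothetical description of the three input methods.
type PdfInput =
  | { kind: "binary"; propertyName: string; available: string[] }
  | { kind: "base64"; content: string }
  | { kind: "url"; url: string };

// Returns null when the input looks usable, otherwise a short problem description.
function checkPdfInput(input: PdfInput): string | null {
  switch (input.kind) {
    case "binary":
      // The configured property must exist on the incoming item.
      return input.available.indexOf(input.propertyName) !== -1
        ? null
        : `No binary data found in property "${input.propertyName}"`;

    case "base64": {
      // Ignore whitespace, then require the standard alphabet,
      // at most two "=" padding characters, and a length divisible by 4.
      const compact = input.content.replace(/\s+/g, "");
      const wellFormed =
        compact.length > 0 &&
        compact.length % 4 === 0 &&
        /^[A-Za-z0-9+/]*={0,2}$/.test(compact);
      return wellFormed ? null : "Invalid base64 content";
    }

    case "url":
      // Syntax check only; reachability still has to be verified separately.
      return /^https?:\/\/[^\s/]+/i.test(input.url)
        ? null
        : "URL must start with http:// or https:// and include a host";
  }
}
```

For example, checkPdfInput({ kind: "binary", propertyName: "data", available: ["file"] }) reports the missing property, which points to a wrong Input Binary Field value.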