Actions (80)
- Extract Text From Word
- Find And Replace Text
- Convert PDF To Editable PDF Using OCR
- Create Swiss QR Bill
- Split PDF By Barcode
- Split PDF By Swiss QR
- Split PDF By Text
- Split PDF Regular
- Create PDF/A
- Convert HTML To PDF
- Convert Markdown To PDF
- Upload File To PDF4me
- Add Attachment To PDF
- Add Barcode To PDF
- Add Form Fields To PDF
- Fill PDF Form
- Add HTML Header Footer
- Add Image Stamp To PDF
- Add Margin To PDF
- Add Page Number To PDF
- Add Text Stamp To PDF
- AI-Invoice Parser
- AI-Process HealthCard
- AI-Process Contract
- Generate Barcode
- Classify Document
- Parse Document
- Linearize PDF
- Flatten PDF
- Convert To PDF
- Json To Excel
- Convert PDF To Excel
- Convert PDF To Word
- Convert PDF To PowerPoint
- Convert VISIO
- Crop Image
- Delete Blank Pages From PDF
- Delete Unwanted Pages From PDF
- Extract Pages
- Merge Multiple PDFs
- Overlay PDFs
- Rotate Document
- Rotate Page
- Sign PDF
- URL to PDF
- Add Image Watermark To Image
- Add Text Watermark To Image
- Compress Image
- Convert Image Format
- Create Images From PDF
- Flip Image
- Get Image Metadata
- Image Extract Text
- Remove EXIF Tags From Image
- Replace Text With Image
- Replace Text With Image In Word
- Resize Image
- Rotate Image
- Rotate Image By EXIF Data
- Compress PDF
- Get PDF Metadata
- Repair PDF Document
- Get Document From Pdf4me
- Update Hyperlinks Annotation
- Protect Document
- Unlock PDF
- Disable Tracking Changes In Word
- Enable Tracking Changes In Word
- Generate Document Single
- Generate Documents Multiple
- Get Tracking Changes In Word
- Read Barcode From Image
- Read Barcode From PDF
- Read SwissQR Code
- Extract Form Data From PDF
- Extract Pages From PDF
- Extract Attachment From PDF
- Extract Text By Expression
- Extract Table From PDF
- Extract Resources
Overview
This node extracts specific pages from a PDF document. The source PDF can be supplied as binary data from a previous node, as a base64-encoded string, or as a URL pointing to the file. You specify the pages to extract as page numbers or ranges, and the node returns a new PDF document containing only those pages.
It is useful when you need to isolate certain pages from large PDF files for further processing, sharing, or archiving. For example:
- Extracting invoice pages from a multi-page PDF report.
- Splitting a contract into separate sections.
- Creating a summary document with selected pages from a larger manual.
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to extract pages from. Options: • Binary Data (from previous node) • Base64 String (provide PDF content as a base64-encoded string) • URL (provide URL to PDF file) |
| Input Binary Field | Name of the binary property that contains the PDF file. Used when Input Data Type is set to Binary Data. Usually "data" for file uploads. |
| Base64 PDF Content | Base64-encoded PDF document content. Used when Input Data Type is set to Base64 String. |
| PDF URL | URL to the PDF file to extract pages from. Used when Input Data Type is set to URL. |
| Document Name | Name of the output PDF document after extraction. Defaults to "output.pdf". |
| Page Numbers | Page numbers to extract from the PDF. Supports single pages (e.g., "1"), multiple pages separated by commas (e.g., "1,3,5"), and ranges (e.g., "2-4"). |
| Advanced Options | Collection of additional options. Currently supports: • Custom Profiles: JSON string to adjust custom properties for API calls, allowing advanced configuration based on external API documentation. |
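The Page Numbers format described in the table can be illustrated with a small parser. This is only a sketch of how such a specification might be expanded, not the node's actual implementation: the function name `parse_page_numbers` is hypothetical, and it assumes pages are numbered from 1 and that duplicates are collapsed and sorted, neither of which the node documentation confirms.

```python
def parse_page_numbers(spec: str) -> list[int]:
    """Expand a page spec such as "1,3,5" or "2-4" into a sorted list of pages."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec: {spec!r}")
        if "-" in part:
            # A range such as "2-4" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range runs backwards: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    # Assumption: page numbering starts at 1.
    if any(page < 1 for page in pages):
        raise ValueError(f"page numbers must be 1 or greater: {spec!r}")
    return sorted(pages)


print(parse_page_numbers("1,3,5"))  # single pages
print(parse_page_numbers("2-4"))    # a range
```

Running a check like this before calling the node can catch malformed specifications early, although it cannot tell whether the pages exist in the source PDF.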
Output
The node outputs one item for each extracted PDF document. The item's json field holds metadata and references to the resulting PDF file, while the PDF content itself is attached as binary data under the specified binary property name.
When the node processes multiple items, it returns an array with one such output per input item.
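The output shape described above can be sketched as follows. The layout (a json object plus a binary object keyed by property name, each entry carrying base64 data, a mimeType, and a fileName) follows n8n's general item convention. The metadata values, the "data" property name, and the helper `pdf_bytes_from_item` are assumptions made for illustration, and n8n instances that store binary data outside of memory hold a reference rather than inline base64.

```python
import base64


def pdf_bytes_from_item(item: dict, binary_property: str = "data") -> bytes:
    """Decode the PDF attached to one output item (hypothetical helper)."""
    entry = item.get("binary", {}).get(binary_property)
    if entry is None:
        raise KeyError(f"no binary data found in property {binary_property!r}")
    return base64.b64decode(entry["data"])


# Hypothetical output item; real metadata fields may differ.
sample_item = {
    "json": {"fileName": "output.pdf"},
    "binary": {
        "data": {
            "data": base64.b64encode(b"%PDF-1.4\n").decode("ascii"),
            "mimeType": "application/pdf",
            "fileName": "output.pdf",
        }
    },
}

pdf = pdf_bytes_from_item(sample_item)
print(len(pdf), "bytes decoded")
```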
Dependencies
- Requires access to an external PDF processing API service capable of extracting pages from PDFs.
- Needs proper authentication credentials (such as an API key) configured in n8n to interact with the external PDF service.
- Internet access is required if the input PDF is provided via URL or if the API service is cloud-based.
Troubleshooting
Common Issues:
- Incorrect page number format can cause errors. Ensure page numbers are valid and within the range of the source PDF.
- Providing an invalid or inaccessible URL will result in failure to fetch the PDF.
- A missing or incorrect binary property name when using binary input means the node finds no data to process.
- Invalid base64 string input will cause decoding errors.
Error Messages:
- "Page numbers out of range": Check that the requested pages exist in the source PDF.
- "Failed to fetch PDF from URL": Verify the URL is correct and accessible.
- "No binary data found in property": Confirm the binary property name matches the input data.
- "Invalid base64 content": Validate the base64 string format.
Resolving these typically involves verifying input parameters and ensuring the source PDF is correctly provided.
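A base64 input can be checked before it reaches the node. The sketch below is one possible pre-flight check, not part of the node itself: the function name `check_base64_pdf` and its error messages are hypothetical, and the header test is deliberately strict (it assumes the decoded file begins with `%PDF-`, whereas some PDF readers tolerate a few leading bytes).

```python
import base64
import binascii


def check_base64_pdf(content: str) -> bytes:
    """Decode a base64 string and confirm it looks like a PDF (hypothetical check)."""
    # Drop line breaks and spaces, which often appear in wrapped base64 text.
    compact = "".join(content.split())
    try:
        raw = base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("invalid base64 content") from exc
    # Strict assumption: the file starts with the PDF header marker.
    if not raw.startswith(b"%PDF-"):
        raise ValueError("decoded content does not start with a PDF header")
    return raw


encoded = base64.b64encode(b"%PDF-1.7\n").decode("ascii")
print(len(check_base64_pdf(encoded)), "bytes look like a PDF")
```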
Links and References
- PDF4me API Documentation — For details on custom profiles and advanced options.
- Base64 Encoding Reference — Understanding base64 encoded data.
- PDF Page Numbering — Official PDF specification for page numbering conventions.