PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Overview

The Get PDF Metadata operation extracts metadata from a PDF file. The PDF can be supplied in one of three ways: as binary data from a previous node, as a Base64-encoded string, or via a URL pointing to the file.

Typical use cases include:

  • Automatically retrieving document properties such as author, title, creation date, and other embedded metadata.
  • Integrating PDF metadata extraction into automated workflows for document management, auditing, or indexing.
  • Validating PDF files before further processing by checking their metadata.

For example, you might use this node to extract metadata from invoices received as PDFs in an email workflow or to gather document details from PDFs stored on a web server.
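
The validation use case can be illustrated with a minimal sketch. It assumes the metadata arrives as a flat dictionary keyed by standard PDF field names such as Title and Author; the `REQUIRED_FIELDS` policy and the `missing_metadata_fields` helper are hypothetical and not part of the node.

```python
from typing import Any, List, Mapping, Sequence

# Hypothetical policy: fields a document must carry before further processing.
REQUIRED_FIELDS = ("Title", "Author", "CreationDate")


def missing_metadata_fields(
    metadata: Mapping[str, Any],
    required: Sequence[str] = REQUIRED_FIELDS,
) -> List[str]:
    """Return the required fields that are absent or blank in the metadata."""
    missing = []
    for name in required:
        value = metadata.get(name)
        # Treat None and whitespace-only strings as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


# Sample data invented for illustration only.
sample = {"Title": "Invoice 1042", "Author": "", "Producer": "ExampleWriter"}
print(missing_metadata_fields(sample))  # ['Author', 'CreationDate']
```

A check like this could run in a step after the node, routing files with an empty result onward and the rest to manual review.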

Properties

Name               | Meaning
-------------------|--------
Input Data Type    | How the PDF file is provided: Binary Data (from previous node), Base64 String, or URL
Input Binary Field | Name of the binary property containing the PDF file (used only if Input Data Type is Binary Data)
Base64 PDF Content | Base64-encoded string of the PDF content (used only if Input Data Type is Base64 String)
PDF URL            | URL of the PDF file to extract metadata from (used only if Input Data Type is URL)
Output File Name   | Filename for the output metadata JSON file (default: pdf_metadata.json)
Async              | Enable asynchronous processing (boolean flag)
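
As a rough illustration of how the input properties relate, the sketch below maps each Input Data Type to the one property it requires and shows how PDF bytes might be turned into a value for Base64 PDF Content. The helper names are hypothetical, and the sketch assumes the node accepts standard Base64 without line breaks, which the description above does not confirm.

```python
import base64

# Property that must be filled in for each Input Data Type (per the Properties section).
REQUIRED_PROPERTY = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 PDF Content",
    "URL": "PDF URL",
}


def required_property(input_data_type: str) -> str:
    """Return the name of the property that the chosen input type relies on."""
    try:
        return REQUIRED_PROPERTY[input_data_type]
    except KeyError:
        raise ValueError(f"Unknown Input Data Type: {input_data_type!r}") from None


def encode_pdf_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as an ASCII Base64 string (standard alphabet, no newlines)."""
    return base64.b64encode(pdf_bytes).decode("ascii")


# Placeholder bytes for illustration; not a complete, valid PDF.
tiny_pdf = b"%PDF-1.4\n%%EOF\n"
print(required_property("Base64 String"), "=", encode_pdf_base64(tiny_pdf))
```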

Output

The node outputs a JSON object containing the extracted metadata from the PDF. This typically includes standard PDF metadata fields such as:

  • Title
  • Author
  • Subject
  • Keywords
  • Creator
  • Producer
  • CreationDate
  • ModDate

The exact structure depends on the PDF's embedded metadata but will be presented as JSON.
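
PDF files conventionally store CreationDate and ModDate in the form `D:YYYYMMDDHHmmSS+HH'mm'`. Whether the node returns that raw form or an already normalized timestamp is not stated here, so the following is only a sketch under the assumption that raw PDF date strings come through; `parse_pdf_date` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# PDF date string: D:YYYY[MM[DD[HH[mm[SS]]]]] followed by an optional
# UTC relation (+, - or Z) and an optional offset written as HH'mm'.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?P<sign>[+\-Z])?(?:(?P<tzh>\d{2})'?(?P<tzm>\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> Optional[datetime]:
    """Parse a raw PDF date string; return None if it is not in that format."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        return None
    g = match.groupdict()
    try:
        tz = None  # stays naive when the string carries no UTC relation
        if g["sign"] == "Z":
            tz = timezone.utc
        elif g["sign"] in ("+", "-"):
            offset = timedelta(hours=int(g["tzh"] or 0), minutes=int(g["tzm"] or 0))
            tz = timezone(offset if g["sign"] == "+" else -offset)
        return datetime(
            int(g["year"]), int(g["month"] or 1), int(g["day"] or 1),
            int(g["hour"] or 0), int(g["minute"] or 0), int(g["second"] or 0),
            tzinfo=tz,
        )
    except ValueError:
        # Out-of-range components such as month 13 or a 99-hour offset.
        return None


parsed = parse_pdf_date("D:20240131120000+01'00'")
print(parsed.isoformat())  # 2024-01-31T12:00:00+01:00
```

Values that do not match, such as an ISO 8601 timestamp, yield `None`, so the caller can fall back to using the string as returned.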

If the node produces any binary data (e.g., the metadata file), it will be available as a downloadable JSON file named according to the "Output File Name" property.

Dependencies

  • Requires access to the PDF file as binary data, a Base64 string, or a reachable URL.
  • Depends on an external PDF processing service, presumably the PDF4me API named in the node description (the integration itself is not detailed here).
  • May require API authentication credentials configured in n8n to reach the PDF processing backend.

Troubleshooting

  • Common Issues:

    • Providing an incorrect binary field name when using binary data input will cause the node to fail to locate the PDF file.
    • Invalid base64 strings or inaccessible URLs will result in errors during metadata extraction.
    • Network issues or permission restrictions may prevent accessing the PDF URL.
  • Error Messages:

    • Errors related to missing or invalid input data usually indicate misconfiguration of the input properties.
    • Authentication or API errors suggest missing or incorrect API credentials.
    • Timeout or network errors when fetching PDF from URL indicate connectivity problems.
  • Resolutions:

    • Verify that the binary property name matches exactly the name used in the previous node.
    • Ensure base64 content is correctly encoded and complete.
    • Confirm the URL is reachable and publicly accessible or properly authenticated.
    • Check API credentials and permissions in n8n settings.
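
The Base64 checks above can be approximated before the content ever reaches the node. The sketch below is a hypothetical pre-flight helper, not something the node provides; it uses only the Python standard library and relies on two PDF conventions, the `%PDF-` header at the start of the file and the `%%EOF` marker near the end.

```python
import base64
import binascii


def check_base64_pdf(content: str) -> str:
    """Return "ok", or a short description of what looks wrong with the content."""
    cleaned = "".join(content.split())  # drop spaces and line breaks
    if not cleaned:
        return "empty content"
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(cleaned, validate=True)
    except (binascii.Error, ValueError):
        return "not valid Base64 (bad characters or padding)"
    # Strict check: some readers tolerate leading bytes before the header.
    if not raw.startswith(b"%PDF-"):
        return "decodes, but does not start with the %PDF- header"
    # Heuristic only: a missing end marker often means the data was cut off.
    if b"%%EOF" not in raw[-1024:]:
        return "no %%EOF marker near the end; content may be truncated"
    return "ok"


# Placeholder bytes for illustration; not a complete, valid PDF.
good = base64.b64encode(b"%PDF-1.4\n%%EOF\n").decode("ascii")
print(check_base64_pdf(good))            # ok
print(check_base64_pdf("not base64!!"))  # not valid Base64 (bad characters or padding)
```

Passing these checks does not guarantee the file is a well-formed PDF; it only rules out the most common encoding and truncation mistakes.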
