PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4ME API

Actions: 80

Overview

This node extracts form data from a PDF document. Use it to programmatically retrieve filled-in values from PDFs that contain interactive form fields, such as surveys, applications, or contracts. It returns the form inputs as structured JSON, so data collection workflows can pass them on for further processing.

Practical examples:

  • Extracting user-submitted data from PDF forms uploaded via a web portal.
  • Processing application forms that were converted to PDFs with fillable fields (a plain scan without interactive fields yields no form data).
  • Automating extraction of invoice or contract details embedded in PDF forms.

Properties

| Name | Meaning |
| --- | --- |
| Input Data Type | How the PDF file is provided: Binary Data (from previous node), Base64 String, or URL to PDF file. |
| Input Binary Field | Name of the binary property containing the PDF file (used if Input Data Type is Binary Data). Usually "data". |
| Base64 PDF Content | Base64-encoded string of the PDF content (used if Input Data Type is Base64 String). |
| PDF URL | URL pointing to the PDF file (used if Input Data Type is URL). |
| Document Name | Name assigned to the document during processing (default: "document.pdf"). |
| Advanced Options | Optional JSON string specifying custom profiles and extra API options for finer control over extraction. |
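The way these properties fit together can be sketched in TypeScript. This is only an illustration: the type names, property keys, and helper functions below are assumptions made for the example, not the node's internal identifiers. It models the three input modes as a discriminated union, applies the documented "document.pdf" default, and checks that Advanced Options is a valid JSON object before it is used.

```typescript
// Hypothetical model of the node's parameters. All names here are
// illustrative assumptions, not the node's real internal identifiers.
type PdfInput =
  | { inputDataType: "binaryData"; binaryPropertyName: string }
  | { inputDataType: "base64"; base64Content: string }
  | { inputDataType: "url"; pdfUrl: string };

interface ExtractFormDataParams {
  input: PdfInput;
  documentName?: string; // documented default: "document.pdf"
  advancedOptions?: string; // optional JSON string
}

// Parse the Advanced Options string; an empty value means "no extra options".
function parseAdvancedOptions(raw: string | undefined): Record<string, unknown> {
  if (raw === undefined || raw.trim() === "") {
    return {};
  }
  const parsed: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Advanced Options must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// Fill in defaults so later steps can rely on every field being present.
function normalizeParams(params: ExtractFormDataParams) {
  return {
    input: params.input,
    documentName: params.documentName?.trim() || "document.pdf",
    advancedOptions: parseAdvancedOptions(params.advancedOptions),
  };
}
```

Because the input is a discriminated union, only the field that belongs to the selected Input Data Type can be supplied, which mirrors the "used if" conditions in the table.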

Output

The output is a JSON object representing the form data extracted from the PDF. It typically consists of key-value pairs in which each key is a form field name and each value is the data entered in that field. All detected fields and their values are included.

No binary output is produced by this operation.
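A downstream step (for example, an n8n Code node) might read the extracted key-value pairs as sketched below. The exact output shape is not specified here, so the flat field-name-to-value record, the `pickFields` helper, and the sample field names are all assumptions made for illustration.

```typescript
// Assumed shape: a flat record of form field names to values.
type ExtractedFields = Record<string, unknown>;

interface PickResult {
  values: Record<string, string>; // required fields that were filled in
  missing: string[]; // required fields that were absent or empty
}

// Hypothetical helper: collect the fields a workflow needs and report
// any that the PDF did not contain or that were left blank.
function pickFields(fields: ExtractedFields, required: string[]): PickResult {
  const values: Record<string, string> = {};
  const missing: string[] = [];
  for (const fieldName of required) {
    const raw = fields[fieldName];
    const text = raw === undefined || raw === null ? "" : String(raw).trim();
    if (text === "") {
      missing.push(fieldName);
    } else {
      values[fieldName] = text;
    }
  }
  return { values, missing };
}
```

Reporting missing fields separately lets the workflow decide whether to stop, branch, or continue with partial data when a PDF has fewer form fields than expected.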

Dependencies

  • Requires access to the PDF4me API, the external service that extracts the form data from the PDF.
  • The node expects proper authentication credentials (e.g., an API key) configured in n8n to communicate with the PDF processing service.
  • Internet access is needed if providing the PDF via URL or if the API is cloud-based.

Troubleshooting

  • Common issues:

    • Providing incorrect input data type or mismatched input fields (e.g., specifying binary data but no binary property named accordingly).
    • Invalid or inaccessible PDF URL leading to download failures.
    • Malformed base64 string causing decoding errors.
    • PDFs without any form fields will result in empty or minimal output.
    • API authentication failures due to missing or invalid credentials.
  • Error messages and resolutions:

    • "Failed to fetch PDF from URL": Check that the URL is correct and publicly reachable.
    • "Invalid base64 content": Verify the base64 string is complete and correctly encoded.
    • "No form fields found": Confirm the PDF actually contains interactive form fields.
    • "Authentication error": Ensure API credentials are set up properly in n8n.
    • "Binary property not found": Make sure the binary property name matches the actual binary data field from the previous node.
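Several of the issues above can be caught before the node runs. The following is a minimal sketch of such pre-flight checks, assuming they are performed in a preceding step; the function names are hypothetical and the checks are heuristics, not the validation the node itself applies.

```typescript
// Heuristic check that a string is well-formed base64 (standard alphabet,
// correct padding). Whitespace and line breaks are ignored.
function isLikelyBase64(content: string): boolean {
  const compact = content.replace(/\s+/g, "");
  return (
    compact.length > 0 &&
    compact.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(compact)
  );
}

// A PDF file starts with the bytes "%PDF-", which base64-encode to a
// string beginning with "JVBERi".
function looksLikeBase64Pdf(content: string): boolean {
  return isLikelyBase64(content) && content.trimStart().startsWith("JVBERi");
}

// Check that a string parses as an absolute http(s) URL. This does not
// verify that the URL is actually reachable.
function isHttpUrl(candidate: string): boolean {
  try {
    const parsed = new URL(candidate);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}
```

These checks only rule out obviously malformed input. A URL that passes `isHttpUrl` can still fail to download, and a string that passes `looksLikeBase64Pdf` can still be a PDF without any form fields.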
