Overview
This n8n node, Read PDF Form data, reads a PDF file from a specified binary property and extracts its form data and metadata. It is useful for workflows that need to process PDF forms—such as extracting filled-in values from digital forms, automating document processing, or archiving form responses.
Practical examples:
- Extracting user-submitted data from PDF application forms.
- Automating the collection of survey results stored in PDF format.
- Integrating with document management systems to index PDF form content.
Properties
| Name | Type | Meaning |
|---|---|---|
| Binary Property | String | Name of the binary property from which to read the PDF file. This should match the property name where the PDF file is attached in the input item. |
Output
The node outputs an object in the json field with the following structure:
{
"numpages": <number>, // Total number of pages in the PDF
"numrender": <number>, // Number of pages rendered (always 0 in this implementation)
"info": { ... }, // General PDF information (e.g., title, author, etc.)
"metadata": { ... }, // Additional metadata if available
"text": "<string>", // Concatenated text content from the PDF (empty in this implementation)
"version": "<string>", // Version of the pdfjs library used
"formData": { ... } // Extracted form fields and their values, if present
}
- If an error occurs and "Continue On Fail" is enabled, the output will be:
{ "error": "<error message>" }
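For downstream code (for example in a Code node or a custom node), the two output shapes above can be modeled as a union type. The following is a minimal sketch, not part of the node itself: the type and function names (`PdfFormOutput`, `PdfFormError`, `isPdfFormError`, `listFilledFields`) are hypothetical, and treating `metadata` as possibly `null` is an assumption based on "if available".

```typescript
// Hypothetical types mirroring the documented output structure.
interface PdfFormOutput {
  numpages: number;
  numrender: number; // documented as always 0 in this implementation
  info: Record<string, unknown>;
  metadata: Record<string, unknown> | null; // assumption: null when unavailable
  text: string; // documented as empty in this implementation
  version: string;
  formData: Record<string, unknown>;
}

// Shape emitted when "Continue On Fail" is enabled and an error occurs.
interface PdfFormError {
  error: string;
}

type PdfFormResult = PdfFormOutput | PdfFormError;

// Type guard: true when the result is the error shape.
function isPdfFormError(result: PdfFormResult): result is PdfFormError {
  return typeof (result as PdfFormError).error === "string";
}

// Returns the names of form fields that hold a non-empty value.
// Error results and PDFs without form fields both yield an empty list.
function listFilledFields(result: PdfFormResult): string[] {
  if (isPdfFormError(result)) {
    return [];
  }
  return Object.entries(result.formData)
    .filter(([, value]) => value !== null && value !== undefined && value !== "")
    .map(([name]) => name);
}
```

Checking for the error shape first keeps the rest of the workflow from reading `formData` on an item that only carries an `error` message.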
Binary Data:
The node passes through the original binary data unchanged in the output's binary property.
Dependencies
- External Library: Uses pdfjs-dist for PDF parsing.
- No external API keys required.
- The PDF file must be provided as a binary property on the input item.
Troubleshooting
Common Issues:
- Missing Binary Data: If the specified binary property does not exist, the node will throw an error indicating it cannot find the PDF file.
- Corrupted or Unsupported PDF: If the PDF cannot be parsed, an error will be thrown.
- Form Data Not Present: If the PDF does not contain form fields, the formData object will be empty.
Error Messages:
- "Cannot find binary data": Ensure the input item contains the correct binary property.
- "Failed to parse PDF": Check that the uploaded file is a valid PDF and not corrupted.
How to resolve:
- Double-check the property name in "Binary Property".
- Verify the input file is a valid, non-corrupted PDF.
- Make sure the PDF actually contains form fields if you expect formData to be populated.
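The first two checks can be approximated before the item reaches the node. The sketch below is illustrative only: `InputItem`, `BinaryEntry`, and `checkPdfInput` are hypothetical names, the returned messages are not the node's own error strings, and it assumes the binary content is held as a base64 string in a `data` field, which may not be the case if binary data is stored outside the item. It confirms the property exists and that the file begins with the `%PDF-` header; it does not prove the rest of the file is intact.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical minimal shape of an input item; assumes base64 content in `data`.
interface BinaryEntry {
  data: string;
  mimeType?: string;
}

interface InputItem {
  binary?: Record<string, BinaryEntry>;
}

// Returns null when the item looks usable, otherwise a human-readable reason.
function checkPdfInput(item: InputItem, binaryProperty: string): string | null {
  const entry = item.binary?.[binaryProperty];
  if (entry === undefined) {
    return `No binary property named "${binaryProperty}" on the input item`;
  }

  // 8 base64 characters decode to 6 bytes, enough to cover the 5-byte header.
  const header = Buffer.from(entry.data.slice(0, 8), "base64")
    .subarray(0, 5)
    .toString("latin1");
  if (header !== "%PDF-") {
    return `Binary property "${binaryProperty}" does not start with a PDF header`;
  }

  return null;
}
```

A non-null return value from such a check can be used to route the item to an error branch instead of letting the node fail on it.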