Overview
This node extracts form field information from a PDF document provided as binary data. It reads the PDF, identifies all interactive form fields (such as text inputs, checkboxes, radio buttons), and outputs their names and types. This is useful when you need to analyze or process PDF forms programmatically, for example, to dynamically generate UI elements based on the form structure or validate expected fields before filling them.
Common scenarios include:
- Extracting metadata about PDF forms in document automation workflows.
- Preparing data mappings for automated PDF form filling.
- Validating that uploaded PDFs contain required form fields.
Properties
| Name | Meaning |
|---|---|
| Property Name | Name of the binary property holding the PDF document to analyze. Default is "data". |
| Max PDF Size | Maximum allowed size of the PDF file in megabytes (MB). Default is 10. |
Output
The node outputs an array of JSON objects, each corresponding to one input item. Each output object contains:
- totalFields: Number of form fields detected in the PDF.
- fields: An array of objects describing each form field with:
  - key: The name of the form field.
  - type: The type of the form field (e.g., textfield, checkbox, radiobutton), derived from the PDF field class name.
If an error occurs for an item and "Continue On Fail" is enabled, the output for that item will contain an error property with the error message.
Example output JSON snippet:
{
  "totalFields": 3,
  "fields": [
    { "key": "firstName", "type": "textfield" },
    { "key": "subscribeNewsletter", "type": "checkbox" },
    { "key": "gender", "type": "radiobutton" }
  ]
}
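For the "validate required fields" scenario, a downstream step could compare the fields array against a list of expected keys. The following is a minimal sketch, not part of the node itself: the `findMissingFields` helper and the interface names are hypothetical, and the sample data simply repeats the example output.

```typescript
// Shape of one output item, as described in the Output section.
interface FormField {
  key: string;
  type: string;
}

interface FormFieldsOutput {
  totalFields: number;
  fields: FormField[];
}

// Hypothetical helper: returns the required keys that are absent from the output.
function findMissingFields(output: FormFieldsOutput, requiredKeys: string[]): string[] {
  return requiredKeys.filter(
    (required) => !output.fields.some((field) => field.key === required),
  );
}

// Sample data copied from the example output.
const sample: FormFieldsOutput = {
  totalFields: 3,
  fields: [
    { key: "firstName", type: "textfield" },
    { key: "subscribeNewsletter", type: "checkbox" },
    { key: "gender", type: "radiobutton" },
  ],
};

// "lastName" is not in the sample, so it is reported as missing.
const missingKeys = findMissingFields(sample, ["firstName", "lastName"]);
console.log(missingKeys);
```

An empty result means every required field is present, so the PDF can be passed on to a form-filling step.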
Dependencies
- Uses the pdf-lib library to load and parse PDF documents and extract form fields.
- Requires the PDF document to be provided as binary data in the specified binary property.
- No external API keys or services are needed.
- The node expects the input binary data to be a valid PDF file and enforces a maximum file size limit configurable via the node properties.
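To illustrate how a field type might be derived from a class name, here is a small sketch. It assumes the class name is something like `PDFTextField` or `PDFCheckBox` (names used by pdf-lib's field classes) and that the type is obtained by dropping the `PDF` prefix and lowercasing the rest. That rule is an assumption for illustration; the node's real mapping may differ. `toFieldType`, `describeField`, and `NamedField` are hypothetical names. With pdf-lib, the field objects would typically come from `form.getFields()`, the name from `field.getName()`, and the class name from `field.constructor.name`.

```typescript
interface FieldDescriptor {
  key: string;
  type: string;
}

// Minimal structural view of a form field; pdf-lib fields expose getName().
interface NamedField {
  getName(): string;
}

// Hypothetical rule (an assumption, not the node's confirmed logic):
// strip a leading "PDF" and lowercase the remainder.
function toFieldType(className: string): string {
  const withoutPrefix = className.indexOf("PDF") === 0 ? className.slice(3) : className;
  return withoutPrefix.toLowerCase();
}

// Builds one entry of the "fields" array from a field and its class name.
function describeField(field: NamedField, className: string): FieldDescriptor {
  return { key: field.getName(), type: toFieldType(className) };
}

// Usage with a stand-in object instead of a real pdf-lib field.
const fakeField: NamedField = { getName: () => "firstName" };
console.log(describeField(fakeField, "PDFTextField"));
```

Passing the class name in as a string keeps the sketch independent of pdf-lib, so it can be run and tested without loading a PDF.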
Troubleshooting
- No binary data found: If the specified binary property does not exist or is empty, the node throws an error. Ensure the correct binary property name is set and that the input contains the PDF file.
- Invalid file type: If the binary data is not a PDF (based on MIME type), an error is thrown. Verify the input file is a valid PDF.
- File size exceeded: If the PDF exceeds the configured maximum size, the node throws an error. Increase the "Max PDF Size" property if larger files are expected.
- Malformed PDF or unsupported features: If the PDF cannot be parsed by pdf-lib, the node may throw errors. Check the PDF integrity and compatibility.
- Enable "Continue On Fail" to allow processing multiple items even if some fail.
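The first three checks above can be sketched as a single pre-validation step, which may help when reproducing an error outside the node. This is only an approximation under stated assumptions: `checkPdfBinary` and `BinaryEntry` are hypothetical, the error messages are illustrative rather than the node's actual wording, and 1 MB is taken to be 1024 * 1024 bytes.

```typescript
// Simplified stand-in for one binary property on an input item.
interface BinaryEntry {
  mimeType: string;
  sizeInBytes: number;
}

// Hypothetical pre-check: returns an error message, or null if the entry passes.
function checkPdfBinary(
  binary: { [propertyName: string]: BinaryEntry } | undefined,
  propertyName: string,
  maxSizeMb: number,
): string | null {
  const entry = binary ? binary[propertyName] : undefined;
  if (!entry) {
    return `No binary data found in property "${propertyName}"`;
  }
  if (entry.mimeType !== "application/pdf") {
    return `Invalid file type: ${entry.mimeType}`;
  }
  // Assumption: the megabyte limit is converted using 1024 * 1024 bytes per MB.
  const maxBytes = maxSizeMb * 1024 * 1024;
  if (entry.sizeInBytes > maxBytes) {
    return `File size exceeds the ${maxSizeMb} MB limit`;
  }
  return null;
}

// Usage: a 2 MB PDF under the default property name and default 10 MB limit.
const item = { data: { mimeType: "application/pdf", sizeInBytes: 2 * 1024 * 1024 } };
console.log(checkPdfBinary(item, "data", 10));
```

Returning a message instead of throwing makes it easy to collect failures per item, similar in spirit to what "Continue On Fail" does for the node's output.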
Links and References
- pdf-lib GitHub repository – Library used for PDF parsing and form field extraction.
- n8n Documentation on Binary Data – Understanding how to work with binary properties in n8n nodes.