PDF.co Api

Generate PDFs, extract data from PDFs, and split, merge, or convert PDFs. Fill PDF forms, add text and images to PDFs, and much more with PDF.co!

Overview

This node is designed to retrieve information about a PDF file from a given URL. It can extract general metadata and optionally extract fillable form fields within the PDF. This functionality is useful for workflows that need to analyze or process PDFs dynamically, such as extracting form data for automation, verifying document properties, or preparing PDFs for further processing.

Practical examples include:

  • Automatically extracting form field names and types from an invoice PDF to populate a database.
  • Retrieving PDF metadata like page count or author before deciding on further processing steps.
  • Using the extracted information to validate PDF contents in document management workflows.

Properties

  • Url: The URL of the PDF file to get information about. The URL must be publicly accessible, or reachable with the HTTP credentials supplied under Advanced Options.
  • Extract Fillable Fields: A boolean option. If enabled, the node also returns details about the interactive form fields present in the document.
  • Advanced Options: A collection of optional settings to customize the request:

    • Webhook URL: URL to receive output data asynchronously.
    • Output Links Expiration (In Minutes): How long the output link remains valid.
    • Password: Password for encrypted PDFs.
    • HTTP Username: Username for HTTP authentication, if required to access the PDF URL.
    • HTTP Password: Password for HTTP authentication.
    • Custom Profiles: JSON string to set extra API call options, e.g., output data format.
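
As a rough illustration of how these properties might translate into an API request, here is a minimal Python sketch. It only builds a request description and makes no network call. The function name `build_pdf_info_request`, the endpoint paths, and the payload key names (`callback`, `expiration`, `httpusername`, `httppassword`, `profiles`) are assumptions made for this example, not taken from the node's source; verify them against the PDF.co API reference before relying on them.

```python
import json
from typing import Any, Dict, Optional


def build_pdf_info_request(
    url: str,
    extract_fillable_fields: bool = False,
    advanced: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Translate the node's properties into a request description.

    Endpoint paths and payload key names are assumptions for illustration.
    """
    if not url.lower().startswith(("http://", "https://")):
        raise ValueError("Url must be an http(s) URL pointing to a PDF")

    advanced = advanced or {}
    # Assumed mapping: node option label -> API payload key.
    option_keys = {
        "Webhook URL": "callback",
        "Output Links Expiration (In Minutes)": "expiration",
        "Password": "password",
        "HTTP Username": "httpusername",
        "HTTP Password": "httppassword",
    }
    payload: Dict[str, Any] = {"url": url}
    for label, key in option_keys.items():
        value = advanced.get(label)
        if value not in (None, ""):  # skip options the user left empty
            payload[key] = value

    # Custom Profiles is a JSON string; parse it once so malformed JSON fails early.
    profiles = advanced.get("Custom Profiles")
    if profiles:
        json.loads(profiles)  # raises json.JSONDecodeError (a ValueError) on bad input
        payload["profiles"] = profiles

    # Assumed endpoints: metadata only vs. metadata plus form fields.
    endpoint = "/v1/pdf/info/fields" if extract_fillable_fields else "/v1/pdf/info"
    return {"endpoint": endpoint, "payload": payload}
```

Validating the URL scheme and the Custom Profiles JSON before sending anything surfaces the most common configuration mistakes without spending an API call.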

Output

The node outputs a JSON object containing information about the PDF file. This includes metadata such as number of pages, file size, and other document properties. If the "Extract Fillable Fields" option is enabled, the output also contains detailed information about each fillable form field found in the PDF, including field names, types, and possibly default values.

The node does not appear to produce binary data; its output is JSON only.
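
To show how a downstream step might consume this output, the following Python sketch reads the page count, author, and form fields from one output item. The `sample_output` structure and its key names (`info`, `PageCount`, `Author`, `FieldsInfo`, `Fields`, `FieldName`, `Type`) are hypothetical stand-ins; inspect a real execution's output to confirm the actual shape.

```python
from typing import Any, Dict, List

# Hypothetical output item; real key names and nesting may differ.
sample_output: Dict[str, Any] = {
    "info": {
        "PageCount": 2,
        "Author": "Jane Doe",
        "FieldsInfo": {
            "Fields": [
                {"FieldName": "invoice_number", "Type": "EditBox", "Value": ""},
                {"FieldName": "paid", "Type": "CheckBox", "Value": "False"},
            ]
        },
    }
}


def summarize_pdf_info(item: Dict[str, Any]) -> Dict[str, Any]:
    """Pull out a few properties, tolerating missing keys."""
    info = item.get("info") or {}
    # The fields section is only expected when "Extract Fillable Fields" is enabled.
    fields: List[Dict[str, Any]] = (info.get("FieldsInfo") or {}).get("Fields") or []
    return {
        "page_count": info.get("PageCount"),
        "author": info.get("Author"),
        "fields": {f.get("FieldName"): f.get("Type") for f in fields},
    }
```

Reading every key with `.get` means the same function works whether or not fillable fields were requested.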

Dependencies

  • Requires access to the PDF.co API, an external service that analyzes the PDF file at the given URL.
  • The node expects an API key credential or similar authentication token configured in n8n to authorize requests to this service.
  • Network access to the provided PDF URL is necessary; if the URL requires HTTP authentication, credentials must be supplied.
  • Optionally, a webhook URL can be set to receive the results through an asynchronous callback.

Troubleshooting

  • Common Issues:

    • Invalid or inaccessible PDF URL: Ensure the URL is correct and publicly accessible or properly authenticated.
    • Incorrect password for encrypted PDFs: Provide the correct password in advanced options.
    • HTTP authentication failures: Verify username and password if the PDF URL requires HTTP auth.
    • API authentication errors: Confirm that the API key or token is correctly configured in n8n credentials.
    • Timeout or network errors: Check network connectivity and endpoint availability.
  • Error Messages:

    • "Unauthorized" or "Authentication failed": Indicates invalid or missing API credentials.
    • "File not found" or "404 error": The PDF URL is incorrect or inaccessible.
    • "Invalid password": The provided PDF password is wrong.
    • "Timeout" or "Network error": Connectivity issues with the PDF URL or API service.

Resolving these typically involves verifying URLs, credentials, and network conditions.
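
The error messages listed above can be turned into a small lookup, for example in an error-handling branch of a workflow. This Python sketch is illustrative only: the function name `troubleshooting_hint` is hypothetical, and the matching relies on the message fragments quoted in this section, which may not match the exact text the service returns.

```python
from typing import List, Tuple

# (message fragments to look for, suggested check), based on the list above.
_HINTS: List[Tuple[Tuple[str, ...], str]] = [
    (("unauthorized", "authentication failed"),
     "Check the API credentials configured in n8n."),
    (("file not found", "404"),
     "Check that the PDF URL is correct and reachable."),
    (("invalid password",),
     "Check the PDF password in Advanced Options."),
    (("timeout", "network error"),
     "Check connectivity to the PDF URL and the API service."),
]


def troubleshooting_hint(message: str) -> str:
    """Return the first matching hint for an error message (case-insensitive)."""
    text = message.lower()
    for fragments, hint in _HINTS:
        if any(fragment in text for fragment in fragments):
            return hint
    return "No known hint; inspect the full error output."
```

For instance, `troubleshooting_hint("Request failed: 404 error")` returns the hint about checking the PDF URL.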
