Overview
This node enables searching for specific text within a PDF document accessible via a URL. It is useful when you need to extract or locate information inside PDFs without manually opening them, such as scanning contracts for keywords, verifying the presence of certain terms in reports, or automating data extraction workflows.
For example, you can provide a URL to a PDF invoice and search for the company name or invoice number. The node supports advanced options like using regular expressions for complex search patterns, restricting the search to specific pages, and handling password-protected PDFs.
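To illustrate the kind of pattern the Search Query might hold when Use Regular Expressions is enabled, here is a small local sketch using Python's `re` module. The sample text and the `INV-` numbering scheme are invented for illustration. The regex dialect used by the PDF service may differ from Python's, so this is a way to prototype a pattern, not a guarantee of identical matching.

```python
import re

# Hypothetical text as it might appear on an invoice page (invented sample).
sample_text = "ACME Corp\nInvoice No: INV-20431\nDue: 2024-05-01\nRef: INV-7"

# Candidate search query: "INV-" followed by at least four digits.
invoice_pattern = re.compile(r"INV-\d{4,}")

def find_invoice_numbers(text: str) -> list[str]:
    """Return every substring of `text` that matches the invoice pattern."""
    return invoice_pattern.findall(text)

print(find_invoice_numbers(sample_text))  # prints ['INV-20431']
```

The shorter reference `INV-7` is skipped because the pattern requires at least four digits, which shows how a regex can narrow a search beyond a plain text query.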
Properties
| Name | Meaning |
|---|---|
| PDF URL | The URL of the PDF file to search. |
| Search Query | The text string or pattern you want to find within the PDF document. |
| Use Regular Expressions | Whether to interpret the search query as a regular expression for more flexible and complex matching. |
| Pages | Comma-separated list of page numbers, without spaces (for example, 1,3,5), to limit the search to specific pages. Leave empty to search all pages. |
| File Name | (Advanced) The desired name for the output file generated by the search operation. |
| Webhook URL | (Advanced) A callback URL or webhook endpoint where the output data will be sent asynchronously. |
| Output Links Expiration (In Minutes) | (Advanced) Duration in minutes before the output link expires. Defaults to 60 minutes. |
| Inline | (Advanced) Whether to return the output directly in the response (true) or only provide a link to download it (false). |
| Word Matching Mode | (Advanced) Defines how words are matched: Smart Match (default, intelligent matching), Exact Match (strict matching), or None (no special word matching). |
| Password | (Advanced) Password for accessing password-protected PDF files. |
| HTTP Username | (Advanced) Username for HTTP authentication if required to access the PDF URL. |
| HTTP Password | (Advanced) Password for HTTP authentication if required to access the PDF URL. |
| Custom Profiles | (Advanced) JSON string to specify custom API call options or profiles for fine-tuning behavior. See the external API documentation for available profile settings. |
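
The properties above can be thought of as a single set of options handed to the external PDF service. The following is a minimal sketch of collecting them into one dictionary and catching obvious mistakes early. The function name `build_search_options` and every key name (`url`, `searchString`, `regexSearch`, and so on) are assumptions made for illustration and are not confirmed by this page; no request is actually sent.

```python
import json
from typing import Any, Optional

def build_search_options(
    pdf_url: str,
    search_query: str,
    use_regex: bool = False,
    pages: str = "",
    inline: bool = True,  # default chosen for this sketch only
    expiration_minutes: int = 60,
    password: Optional[str] = None,
    custom_profiles: Optional[str] = None,
) -> dict[str, Any]:
    """Collect node-style properties into one options dict.

    All key names below are assumed for illustration only.
    """
    if not pdf_url or not search_query:
        raise ValueError("PDF URL and Search Query are both required")

    options: dict[str, Any] = {
        "url": pdf_url,
        "searchString": search_query,
        "regexSearch": use_regex,
        "inline": inline,
        "expiration": expiration_minutes,  # 60 mirrors the documented default
    }
    # An empty Pages value means "search all pages", so the key is omitted.
    if pages:
        options["pages"] = pages
    if password:
        options["password"] = password
    if custom_profiles:
        # Custom Profiles is a JSON string; parse it early to catch syntax errors.
        json.loads(custom_profiles)
        options["profiles"] = custom_profiles
    return options

options = build_search_options(
    "https://example.com/invoice.pdf", "Invoice No", pages="1,2"
)
print(options)
```

Optional values are left out of the dictionary when unset, so the service's own defaults would apply to them.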
Output
The node outputs a JSON object containing the results of the search operation, typically the page number, position, and matched text of each hit. When Inline is enabled, this data is included directly in the response; otherwise, the response provides a download link to the result file, which expires after the period set in Output Links Expiration (In Minutes).
This node primarily returns JSON search results. If an operation does produce binary data (e.g., an output PDF with highlights), it is returned as binary data.
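This page does not specify the exact shape of the JSON result, so the sketch below assumes one: a `matches` list whose items carry a `page`, the matched `text`, and `left`/`top` position values. All of these field names are hypothetical. Under that assumption, a downstream step could group the matches by page like this:

```python
from collections import defaultdict

# Hypothetical result: each match carries a page number, a position,
# and the matched text. The field names are assumptions.
sample_result = {
    "matches": [
        {"page": 1, "text": "Invoice No", "left": 72.0, "top": 96.5},
        {"page": 3, "text": "Invoice No", "left": 72.0, "top": 410.0},
        {"page": 1, "text": "Invoice No", "left": 300.0, "top": 700.2},
    ]
}

def matches_by_page(result: dict) -> dict[int, list[dict]]:
    """Group match objects by their page number, keeping original order."""
    grouped: dict[int, list[dict]] = defaultdict(list)
    for match in result.get("matches", []):
        grouped[match["page"]].append(match)
    return dict(grouped)

for page, hits in sorted(matches_by_page(sample_result).items()):
    print(f"page {page}: {len(hits)} match(es)")
```

If the real output uses different field names, only the two key lookups (`"matches"` and `"page"`) would need to change.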
Dependencies
- The PDF file must be reachable at a publicly accessible URL, or at a URL that accepts the provided HTTP credentials.
- Uses an external PDF processing API service to perform the search operation.
- May require an API key credential configured in n8n to authenticate requests to the external PDF service.
- Optional webhook URL support for asynchronous callbacks.
Troubleshooting
Common issues:
- Invalid or inaccessible PDF URL: Ensure the URL is correct and reachable from the n8n environment.
- Incorrect HTTP credentials: Verify username and password if the PDF URL requires authentication.
- Password-protected PDFs: Provide the correct password to access encrypted documents.
- Malformed regular expressions: When using regex search, ensure the pattern syntax is valid.
- Page numbers format: Use comma-separated integers without spaces for the pages property.
Error messages:
- "Failed to fetch PDF": Check network connectivity and URL correctness.
- "Authentication failed": Verify HTTP username and password.
- "Invalid password for PDF": Confirm the PDF password is correct.
- "Regex pattern error": Review and correct the regular expression syntax.
Resolving these usually involves verifying input parameters and ensuring proper access rights.
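Two of the issues listed above, the pages format and malformed regular expressions, can be checked before the node runs. The sketch below shows one way to do that. The helper names are hypothetical, and the regex check uses Python's `re` module, whose syntax may not match the PDF service's dialect exactly, so a pattern that passes here could still be rejected by the service.

```python
import re

# Comma-separated integers with no spaces, e.g. "1,2,5".
PAGES_FORMAT = re.compile(r"\d+(,\d+)*")

def check_pages(pages: str) -> bool:
    """Return True if `pages` is empty (all pages) or a valid list."""
    return pages == "" or PAGES_FORMAT.fullmatch(pages) is not None

def check_regex(pattern: str) -> bool:
    """Return True if `pattern` compiles as a Python regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(check_pages("1,2,5"))       # True
print(check_pages("1, 2"))        # False: contains a space
print(check_regex(r"INV-\d+"))    # True
print(check_regex("(unclosed"))   # False: missing closing parenthesis
```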
Links and References
- PDF.co API Profiles Documentation: for customizing API call options via the "Custom Profiles" property.
- Regular Expressions Guide: for help constructing valid regex patterns for searches.