LlamaParse

Parse PDF files and get their content in markdown!

Overview

The LlamaParse node parses PDF files and returns their content as markdown. It is useful when you need PDF documents in an editable, machine-processable text format for further automation, analysis, or integration with other tools.

Common scenarios:

  • Extracting text content from PDF reports for automated processing.
  • Converting PDF manuals or documentation into markdown for publishing or editing.
  • Automating data extraction workflows where PDFs are input files.

Practical example:
You have a folder of PDF invoices and want to extract the textual content to feed into a database or generate summaries. This node can parse each PDF file and output its content as markdown, which can then be processed downstream.
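As a rough illustration of the invoice scenario, the sketch below builds one input item per PDF from a list of file names, so that each item carries a value for the File Path property. The item shape, the `filePath` field name, the `toPathItems` helper, and the `/data/invoices/` folder are all assumptions made for this example; they are not taken from the node's implementation.

```typescript
// Hypothetical item shape: one item per PDF, carrying the value for "File Path".
interface PathItem {
  json: { filePath: string };
}

// Keep only names ending in .pdf (case-insensitive) and prefix the folder path.
function toPathItems(folder: string, fileNames: string[]): PathItem[] {
  // Drop a single trailing slash so the joined path has exactly one separator.
  const base = folder.endsWith("/") ? folder.slice(0, -1) : folder;
  return fileNames
    .filter((name) => name.toLowerCase().endsWith(".pdf"))
    .map((name) => ({ json: { filePath: `${base}/${name}` } }));
}

// Example folder listing; in a real workflow this would come from an upstream node.
const invoiceItems = toPathItems("/data/invoices/", ["a.pdf", "notes.txt", "B.PDF"]);
console.log(invoiceItems.map((item) => item.json.filePath));
```

Each resulting path could then be mapped to the node's File Path property, one run per item.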

Properties

Name       Meaning
File Path  The full path to the PDF file you want to parse. Example: /User/user/Desktop/file.pdf

Output

The node outputs an array of JSON objects, each representing parsed content from the PDF file in markdown format. Each item corresponds to a segment or page of the PDF converted into markdown text.

  • The json output field contains these markdown representations.
  • No binary data output is produced by this node.
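A downstream step might combine the per-segment items into a single markdown document. The following is a minimal sketch, assuming each output item stores its markdown under a `text` key inside `json`; that key name and the `joinMarkdown` helper are hypothetical, so check the actual output of a test run before relying on them.

```typescript
// Assumed output item shape; the field name `text` is hypothetical.
interface ParsedItem {
  json: { text?: string };
}

// Concatenate the markdown of all items, skipping empty or missing segments.
function joinMarkdown(parsed: ParsedItem[]): string {
  return parsed
    .map((item) => (item.json.text ?? "").trim())
    .filter((md) => md.length > 0)
    .join("\n\n");
}

// Example with made-up content standing in for two parsed pages.
const combined = joinMarkdown([
  { json: { text: "# Invoice 001" } },
  { json: { text: "Total: 42.00" } },
]);
console.log(combined);
```

An empty string returned here would correspond to a PDF with no extractable text.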

Dependencies

  • Requires a valid API key credential, which the node uses to authenticate with the external parsing service.
  • Uses the llamaindex library internally to perform the parsing.
  • The file must be accessible at the specified path on the system where n8n runs.

Troubleshooting

  • File not found or inaccessible: Ensure the file path is correct and that the n8n process has permission to read the file.
  • Invalid or missing API key: The node requires a valid API key credential. Verify that the credential is set up correctly in n8n.
  • Parsing errors: If the PDF is corrupted or in an unsupported format, the parsing may fail. Try opening the PDF manually to confirm it is valid.
  • Empty output: If the PDF contains no extractable text or is image-based only, the output may be empty or minimal.
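Before sending a file to the node, a quick sanity check can rule out files that are not PDFs at all. The sketch below tests whether a byte buffer begins with the `%PDF-` header that PDF files start with. The `looksLikePdf` helper is hypothetical and is not part of the node; it is a strict check and does not detect corruption later in the file.

```typescript
// Returns true when the buffer starts with the ASCII bytes "%PDF-".
function looksLikePdf(bytes: Uint8Array): boolean {
  const magic = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%", "P", "D", "F", "-"
  if (bytes.length < magic.length) return false;
  return magic.every((expected, index) => bytes[index] === expected);
}

// Example: the first bytes of a PDF header versus the first bytes of a ZIP archive.
const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x37]);
const zipHeader = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);
console.log(looksLikePdf(pdfHeader), looksLikePdf(zipHeader));
```

In Node.js the bytes could come from reading the file at the configured path, which also surfaces a missing file or a permission problem before the node runs.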
