PDF → Image (pdf-to-img)

Convert a PDF (binary) to images (one item per page)

Overview

This node converts PDF files provided as binary data into images. The result is returned either as one binary item per page or as a single item holding an array of data URLs. It is useful when you need to extract visual content from PDFs for further processing, previewing, or integration with other systems that require image formats.

Common scenarios include:

  • Generating thumbnails or previews of PDF documents.
  • Extracting pages as images for use in reports or presentations.
  • Converting password-protected PDFs into images by providing the password.
  • Selecting specific pages to convert rather than the entire document.

For example, a user can input a multi-page PDF and get either one image item per page or a single item containing all pages as base64-encoded data URLs.

Properties

  • PDF Binary Property: The name of the binary property in the input item that contains the PDF file data.
  • Scale: The scale factor to apply when rendering the PDF pages to images (e.g., 3 means 3x scaling).
  • Password: Optional password to unlock encrypted PDF files.
  • Pages (optional): Comma-separated list of page numbers to convert. If empty, all pages are converted.
  • Output Format: The image format for output: PNG or JPEG.
  • Return: How to return the images:
    • One Item per Page (binary): each page as a separate binary item.
    • Single Item (array of data URLs): all pages bundled as an array of base64 data URLs in one item.
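
As an illustration of how a "Pages" value might be interpreted, here is a minimal sketch. The helper name parsePages is hypothetical and is not taken from the node's source. It assumes pages are 1-based, that duplicates are dropped, and that the order given by the user is preserved; the node itself may behave differently on those points.

```typescript
// Hypothetical helper (not from the node's source): turns a "Pages" string
// such as "1, 3, 5" into a list of valid 1-based page numbers.
function parsePages(input: string, pageCount: number): number[] {
  const pages: number[] = [];

  // An empty value means "convert every page".
  if (input.trim() === "") {
    for (let p = 1; p <= pageCount; p++) pages.push(p);
    return pages;
  }

  for (const token of input.split(",")) {
    const trimmed = token.trim();
    // Accept only plain non-negative integers; "abc", "-2" and "2.5" are skipped.
    if (!/^\d+$/.test(trimmed)) continue;
    const page = Number(trimmed);
    // Ignore out-of-range pages and duplicates (assumed behavior).
    if (page >= 1 && page <= pageCount && pages.indexOf(page) === -1) {
      pages.push(page);
    }
  }
  return pages;
}
```

With a five-page PDF, an input such as "1, 3, 99, abc" would select pages 1 and 3 under these assumptions, since 99 is out of range and "abc" is not a number.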

Output

The node outputs items depending on the "Return" property:

  • One Item per Page (binary): Each output item corresponds to one page of the PDF.

    • json field contains:
      • page: the page number.
      • format: the image format used ("png" or "jpeg").
    • binary field contains the image data for that page, named data, with the appropriate MIME type (image/png or image/jpeg).
  • Single Item (array of data URLs): A single output item containing:

    • json.pages: an array of strings, each a data URL representing a page image (e.g., "data:image/png;base64,...").
    • json.pageCount: total number of pages converted.
    • json.format: the image format used ("png" or "jpeg").

No binary data is output in this mode; instead, images are embedded as base64 data URLs inside JSON.
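
The two output shapes described above can be sketched as follows. This is an illustration only: RenderedPage, toPageItems, and toBundleItem are hypothetical names, and the inner layout of the binary entry (a base64 data string plus mimeType) is an assumption rather than something stated by the node's documentation.

```typescript
type ImageFormat = "png" | "jpeg";

// Hypothetical intermediate type: one rendered page, already base64-encoded.
interface RenderedPage {
  page: number;
  base64: string;
}

// "One Item per Page (binary)": json metadata plus a binary entry named "data".
// The fields inside binary.data are assumed for illustration.
interface PageItem {
  json: { page: number; format: ImageFormat };
  binary: { data: { data: string; mimeType: string } };
}

// "Single Item (array of data URLs)": everything lives in json.
interface BundleItem {
  json: { pages: string[]; pageCount: number; format: ImageFormat };
}

function mimeTypeFor(format: ImageFormat): string {
  return format === "png" ? "image/png" : "image/jpeg";
}

function toPageItems(pages: RenderedPage[], format: ImageFormat): PageItem[] {
  return pages.map((p) => ({
    json: { page: p.page, format },
    binary: { data: { data: p.base64, mimeType: mimeTypeFor(format) } },
  }));
}

function toBundleItem(pages: RenderedPage[], format: ImageFormat): BundleItem {
  // Each page becomes a data URL of the form "data:image/png;base64,...".
  const urls = pages.map((p) => `data:${mimeTypeFor(format)};base64,${p.base64}`);
  return { json: { pages: urls, pageCount: urls.length, format } };
}
```

The per-page shape suits downstream nodes that work on binary files, while the bundled shape keeps everything in JSON at the cost of larger JSON payloads.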

Dependencies

  • Uses the external library pdf-to-img to perform PDF rendering and conversion to images.
  • No credentials or API keys are required.
  • Requires the input PDF to be available as binary data in the specified binary property.
  • The node expects a runtime that supports Buffer operations and async iteration (for await...of), such as Node.js.

Troubleshooting

  • Error: Binary property "X" not found
    This occurs if the specified binary property does not exist in the input item. Ensure the correct binary property name is set and that the input contains valid PDF binary data.

  • Password issues
    If the PDF is password protected and the wrong or no password is provided, the conversion will fail. Provide the correct password in the "Password" property.

  • Invalid page numbers
    If the "Pages" property contains invalid or out-of-range page numbers, those pages will be ignored. Use valid positive integers separated by commas.

  • Large PDFs or high scale values
    High scale factors or very large PDFs may cause performance issues or high memory usage. Adjust the scale or split the PDF before processing if needed.
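
To get a feel for why the scale matters, the following sketch estimates the size of one uncompressed page bitmap. It assumes that scale 1 maps one PDF point to one pixel and that the renderer holds a 4-byte-per-pixel RGBA bitmap in memory; both are assumptions about the rendering pipeline, and estimateBitmapBytes is a hypothetical helper, not part of the node.

```typescript
// Hypothetical helper: rough size in bytes of one uncompressed RGBA page bitmap.
// Assumes 1 PDF point = 1 pixel at scale 1 (an assumption about the renderer).
function estimateBitmapBytes(widthPt: number, heightPt: number, scale: number): number {
  const widthPx = Math.ceil(widthPt * scale);
  const heightPx = Math.ceil(heightPt * scale);
  return widthPx * heightPx * 4; // 4 bytes per pixel (RGBA)
}
```

For an A4 page (roughly 595 × 842 points), this gives about 2 MB at scale 1 and about 18 MB at scale 3. Because both dimensions grow with the scale, doubling the scale roughly quadruples the memory needed per page.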
