document-to-text

Converts documents to text using Azure OpenAI

Package Information

Downloads: 21 weekly / 35 monthly

Latest Version: 0.1.4

Author: Matthias Leitner

Available Nodes

Document To Text

Converts documents to text using Azure OpenAI

Documentation

n8n-nodes-document-to-text

n8n is a fair-code licensed workflow automation platform.

Installation
Operations
Credentials
Compatibility
Usage
Development
Publishing
Resources

Installation

Follow the installation guide in the n8n community nodes documentation. After installing the package (including a newly published version), restart n8n so that it loads the node from the installed package.

Operations

This package provides a single node: Document To Text.

Converts PDF documents to plain text using Azure OpenAI vision-capable chat models.
Renders each PDF page to a PNG image and sends it to your Azure OpenAI deployment via the Chat Completions API.
Merges the per‑page responses in order into a single output string.
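
As a rough illustration of the second step, the sketch below builds a Chat Completions request body that carries one rendered page as a base64 PNG data URL. The function name buildPageRequestBody and the type names are hypothetical and are not taken from the node's source; the message shape follows the documented image_url content-part format for vision-capable chat models.

```typescript
// Hypothetical types describing the subset of the Chat Completions
// request body that is relevant for a single page image.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user";
  content: string | ContentPart[];
}

interface ChatRequestBody {
  temperature: number;
  messages: ChatMessage[];
}

// Builds the request body for one page. The default temperature mirrors
// the node's documented default (0.2); everything else is an assumption.
function buildPageRequestBody(
  systemPrompt: string,
  pngBase64: string,
  temperature = 0.2,
): ChatRequestBody {
  return {
    temperature,
    messages: [
      { role: "system", content: systemPrompt },
      {
        role: "user",
        content: [
          {
            type: "image_url",
            // The PNG is embedded inline as a data URL.
            image_url: { url: `data:image/png;base64,${pngBase64}` },
          },
        ],
      },
    ],
  };
}
```

One such body would be sent per page, which is why the number of requests grows with the page count.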

Inputs

Document (required): Base64 string of the PDF to convert. In n8n, reference the incoming binary data with an expression like {{ $binary.myPdf.data }}.

Parameters

Model (Deployment) Name (required): The Azure OpenAI deployment name for your vision‑enabled model (for example, gpt-4o, gpt-4o-mini).
System Prompt (required): Prompt used to instruct the model. A sensible default is provided to extract text without summarizing.
Scale (Render Zoom): PDF render scale (affects image resolution and token usage). Default 1.6.
Temperature: Sampling temperature for the model. Default 0.2.
Max Parallel Requests: Number of per‑page requests to issue concurrently. Default 1 (increase cautiously to avoid rate limits).
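
A setting like Max Parallel Requests is commonly implemented as a small worker pool that caps in-flight requests while keeping results in page order. The helper below is a minimal sketch of that general technique; mapWithConcurrency is a hypothetical name, and the node's actual implementation may differ.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight.
// Results are stored by index, so the output order matches the input
// order even when later pages finish first.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane repeatedly claims the next item until none are left.
  const runLane = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  };

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}
```

With limit set to 1 this degrades to strictly sequential processing, which matches the conservative default described above.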

Output

A single item per input with json.output (the extracted text) and json.pages (number of PDF pages processed).
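
The output shape can be pictured with the sketch below. The interface and the mergePageTexts helper are hypothetical, and the separator placed between pages is an assumption, since the README does not state how page texts are joined.

```typescript
// Shape of the json payload described above.
interface DocumentToTextOutput {
  output: string; // extracted text for the whole document
  pages: number; // number of PDF pages processed
}

// Joins per-page texts in page order. The blank-line separator is an
// assumption made for illustration only.
function mergePageTexts(
  pageTexts: readonly string[],
  separator = "\n\n",
): DocumentToTextOutput {
  return { output: pageTexts.join(separator), pages: pageTexts.length };
}
```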

Notes

Supported input format: PDF only.
Each PDF page results in one Chat Completions request; costs scale with page count and render Scale.
Built‑in retry logic handles transient HTTP errors (429/5xx) with exponential backoff.
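
To make the retry behavior concrete, here is a minimal sketch of exponential backoff for 429 and 5xx responses. The function names, the 500 ms base delay, the 30 s cap, and the attempt limit are all assumptions chosen for illustration; the node's real values are not documented here.

```typescript
// Result of one HTTP attempt; only the status matters for retry decisions.
type HttpResult<T> = { status: number; body?: T };

// 429 (rate limited) and any 5xx status are treated as transient.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay doubles with each attempt (0-based) and is capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Sends the request up to maxAttempts times, sleeping between retries.
// `send` and `sleep` are injected so the logic stays easy to test.
async function requestWithRetry<T>(
  send: () => Promise<HttpResult<T>>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 4,
): Promise<HttpResult<T>> {
  let result = await send();
  for (
    let attempt = 0;
    attempt < maxAttempts - 1 && isRetryableStatus(result.status);
    attempt++
  ) {
    await sleep(backoffDelayMs(attempt));
    result = await send();
  }
  return result;
}
```

If every attempt fails, the last response is returned so the caller can surface the error status.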

Credentials

Use the built‑in n8n credential type Azure OpenAI API (azureOpenAiApi).

Prerequisites

An Azure OpenAI resource with a deployed, vision‑capable chat model (for example, gpt-4o, gpt-4o-mini).
Your resource Endpoint URL (e.g., https://<your‑resource>.openai.azure.com/).
An API Key for the resource.
An API version that supports image inputs. The node defaults to 2024-02-15-preview; a newer version can be used instead.

Set up

In n8n, create credentials of type Azure OpenAI API.
Enter the Endpoint, API Key, and (optionally) API Version.
In the node, select these credentials and specify your Model (Deployment) Name exactly as it’s named in Azure.
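
For orientation, the credential fields map onto an Azure OpenAI Chat Completions request roughly as sketched below. The helper names are hypothetical; the URL layout and the api-key header follow the public Azure OpenAI REST conventions, and the default API version is the one named above.

```typescript
// Builds the Chat Completions URL from the credential's Endpoint and
// API Version plus the node's Model (Deployment) Name.
function buildChatCompletionsUrl(
  endpoint: string,
  deployment: string,
  apiVersion = "2024-02-15-preview",
): string {
  // Drop trailing slashes so the path is not doubled up.
  const base = endpoint.replace(/\/+$/, "");
  return (
    `${base}/openai/deployments/${encodeURIComponent(deployment)}` +
    `/chat/completions?api-version=${encodeURIComponent(apiVersion)}`
  );
}

// Azure OpenAI key authentication uses the `api-key` header.
function buildAzureHeaders(apiKey: string): Record<string, string> {
  return { "api-key": apiKey, "Content-Type": "application/json" };
}
```

Because the deployment name is part of the URL path, a mismatch with the name in Azure typically shows up as a 404 rather than an authentication error.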

Compatibility

Node.js: >= 20.15 (see engines in package.json).
n8n: Community nodes must be enabled. Uses the n8n Nodes API v1.
Platforms: Works on common Node.js platforms without a headless browser. Rendering uses pdfjs-dist with @napi-rs/canvas.
Azure model requirement: A vision‑enabled chat model deployment (e.g., gpt-4o, gpt-4o-mini).

Usage

Basic flow

Obtain a PDF in binary form in n8n (e.g., via HTTP Request, Webhook, Google Drive, or S3). The file should appear under $binary on the incoming item.
Add the Document To Text node and connect it.
In the Document field, use an expression to reference the PDF binary, for example:
- If your binary property is myPdf: {{ $binary.myPdf.data }}
- Adjust the property name to match your workflow.
Select your Azure OpenAI API credentials and set Model (Deployment) Name to your Azure deployment (e.g., gpt-4o).
Optionally adjust Scale, Temperature, and Max Parallel Requests.
Execute the workflow. The node outputs a JSON object with:
- output: the extracted text for the whole document
- pages: the number of pages processed

Tips

Start with Max Parallel Requests = 1 to avoid 429 rate limits; increase gradually if your quota allows.
Higher Scale improves text/image detail but increases token usage and cost.
For very large PDFs, consider splitting or pre‑processing to control cost and execution time.
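
The effect of Scale on image size can be estimated with a short calculation. The sketch below assumes the usual pdf.js viewport convention, where scale 1 maps one PDF point to one pixel; the function names and the rounding are assumptions, and actual token usage depends on how the model tiles the image.

```typescript
// Approximate pixel dimensions of a rendered page.
// widthPt and heightPt are the page size in PDF points (1/72 inch).
function renderedSize(
  widthPt: number,
  heightPt: number,
  scale = 1.6,
): { width: number; height: number } {
  return {
    width: Math.round(widthPt * scale),
    height: Math.round(heightPt * scale),
  };
}

// Pixel count grows with the square of the scale factor.
function pixelCount(widthPt: number, heightPt: number, scale: number): number {
  const size = renderedSize(widthPt, heightPt, scale);
  return size.width * size.height;
}
```

For an A4 page of roughly 595 x 842 points, doubling Scale from 1 to 2 quadruples the number of pixels sent per page, so small increases in Scale can raise cost noticeably.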

Development

n8n doesn’t hot‑reload community nodes. To test locally:

Build the node
```
npm run build
```

Link into n8n’s custom folder (one‑time)

```
# Run this first command in the root of the repository
npm link
mkdir -p ~/.n8n/custom
cd ~/.n8n/custom
npm link n8n-nodes-document-to-text
```

Start n8n locally (with debug logs enabled)

```
N8N_LOG_LEVEL=debug N8N_LOG_PRETTY=true N8N_RUNNERS_ENABLED=true npx -y n8n
```

Develop
- After code changes, run npm run build again.
- Restart n8n to load the new build.