pdf-ocr-parse

Extract text from scanned PDF documents using OCR powered by PDF API Hub.


Latest Version: 1.0.1

Author: Rishabh Dugar

n8n-nodes-pdf-ocr-parse

Extract text from scanned PDF documents using OCR, with support for multiple languages and advanced tuning.

This is an n8n community node powered by PDF API Hub.

| Parameter | Description |
| --- | --- |
| Input Type | URL or Binary file |
| Pages | `all` or specific ranges like `1-3,5` |
| Language | English, Portuguese, or Russian; combine several with `+` (e.g. `eng+por`) |
| Detail Level | Text (plain text) or Words (with bounding box coordinates) |
| Output Format | JSON or plain Text |
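
To make the `Pages` and `Language` formats concrete, here is a minimal Python sketch of how such values could be interpreted. The function names `parse_pages` and `combine_languages` are hypothetical and not part of the node, and the assumption that ranges are 1-based and inclusive is mine, not something the node's documentation states.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a spec such as 'all' or '1-3,5' into sorted page numbers.

    Assumes 1-based, inclusive ranges (not confirmed for the node itself).
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, page_count + 1))

    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1-3,,5"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def combine_languages(codes: list[str]) -> str:
    """Join language codes with '+', e.g. ['eng', 'por'] -> 'eng+por'."""
    return "+".join(codes)
```

For a ten-page document, `parse_pages("1-3,5", 10)` selects pages 1, 2, 3, and 5, and `combine_languages(["eng", "por"])` produces the `eng+por` form shown in the table.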

| Option | Description |
| --- | --- |
| DPI | Resolution for OCR processing (72–400) |
| Character Whitelist | Restrict to specific characters (e.g. `0123456789`) |
| PSM | Page segmentation mode (auto, single block, single line, etc.) |
| OEM | OCR engine mode (legacy, LSTM, or combined) |
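
The options above can be checked before a request is sent. The Python sketch below is illustrative only: `build_options`, the dictionary keys, and the default DPI of 300 are hypothetical. The PSM and OEM names match Tesseract's options of the same name, so the numeric codes used here are Tesseract's; whether this node or PDF API Hub uses Tesseract or accepts these codes is an assumption.

```python
# Tesseract's numeric codes for the modes named in the options table.
# Assumption: the node's modes correspond to these; this is not confirmed.
PSM_CODES = {"auto": 3, "single block": 6, "single line": 7}
OEM_CODES = {"legacy": 0, "lstm": 1, "combined": 2}


def build_options(dpi: int = 300, char_whitelist: str = "",
                  psm: str = "auto", oem: str = "lstm") -> dict:
    """Validate OCR options and return them as a plain dict.

    The key names and the default DPI are hypothetical.
    """
    if not 72 <= dpi <= 400:
        raise ValueError(f"DPI must be between 72 and 400, got {dpi}")
    if psm not in PSM_CODES:
        raise ValueError(f"unknown page segmentation mode: {psm!r}")
    if oem not in OEM_CODES:
        raise ValueError(f"unknown OCR engine mode: {oem!r}")

    options = {"dpi": dpi, "psm": PSM_CODES[psm], "oem": OEM_CODES[oem]}
    if char_whitelist:
        # Only include the whitelist when the caller restricts characters.
        options["char_whitelist"] = char_whitelist
    return options
```

For example, `build_options(dpi=300, char_whitelist="0123456789", psm="single line")` describes a digits-only read of a single text line, which suits fields such as invoice numbers.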