CloudConvert
Use CloudConvert to convert files, create thumbnails, merge files, add watermarks and more!
Actions (51)
- File Actions
- Metadata Actions
- PDF Actions
- Add PDF OCR Layer
- Decrypt PDF
- Merge Files
- Split PDF Into Pages
- Convert PDF to PDF/A
- Extract Pages From PDF
- Encrypt PDF
- Rotate PDF Pages
- Command Actions
- Website Actions
Overview
This node uses CloudConvert to add an OCR (Optical Character Recognition) text layer to scanned PDF files. A scanned PDF is essentially a set of page images, so its text cannot be searched or selected; adding an OCR layer makes that text searchable and selectable. This is useful when text from scanned documents needs to be extracted for editing, searching, or archiving. For example, a workflow can take an uploaded scanned PDF and automatically return a copy with an OCR layer.
Use Case Examples
- Adding an OCR layer to scanned PDFs to enable text search and selection.
- Automating the processing of scanned documents to convert them into searchable PDFs for digital archiving.
Properties
| Name | Meaning |
|---|---|
| Authentication | How to authenticate with the CloudConvert API: OAuth2 or API Key. |
| Binary Input Data | Whether the input file to upload should be taken from a binary field. |
| Input File Contents | The text content of the file to upload, used if Binary Input Data is false. |
| Binary Property | Name of the binary property containing the file data to be converted, used if Binary Input Data is true. |
| Auto Orient | Whether to automatically detect and correct page orientation before performing OCR. |
| Languages | Comma-separated list of language codes to use for OCR, e.g., 'eng,deu'. |
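
As a rough illustration of how the properties above might map onto a CloudConvert job, the sketch below builds a job description in plain Python without sending any request. The task names (`upload-file`, `add-ocr-layer`, `export-file`), the `pdf/ocr` operation name, and the `auto_orient` and `language` field names are assumptions made for this example; check them against the CloudConvert API documentation before relying on them.

```python
from typing import Any, Dict, List


def split_languages(value: str) -> List[str]:
    """Turn a comma-separated string such as 'eng, deu' into ['eng', 'deu']."""
    return [code.strip() for code in value.split(",") if code.strip()]


def build_ocr_job(filename: str, auto_orient: bool, languages: str) -> Dict[str, Any]:
    """Build a job description with upload, OCR, and export steps.

    All task and field names here are assumptions for illustration,
    not a confirmed description of the CloudConvert API.
    """
    return {
        "tasks": {
            "upload-file": {"operation": "import/upload"},
            "add-ocr-layer": {
                "operation": "pdf/ocr",  # assumed operation name
                "input": "upload-file",
                "auto_orient": auto_orient,  # mirrors the Auto Orient property
                "language": split_languages(languages),  # mirrors Languages
                "filename": filename,
            },
            "export-file": {"operation": "export/url", "input": "add-ocr-layer"},
        }
    }


job = build_ocr_job("scan.pdf", auto_orient=True, languages="eng, deu")
print(job["tasks"]["add-ocr-layer"]["language"])
```

Splitting the Languages string up front keeps stray spaces and empty entries out of the request.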
Output
JSON
- data - The resulting PDF file with the added OCR text layer.
Dependencies
- CloudConvert API
Troubleshooting
- Ensure the input PDF is a scanned document suitable for OCR; otherwise, the OCR layer may not be added correctly.
- Verify that the correct authentication method and credentials are provided to avoid authorization errors.
- Check that Languages is a comma-separated list of valid language codes (for example, 'eng,deu'); incorrect codes may cause the OCR process to fail or produce inaccurate results.
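
Some of these checks can be done locally before a file is sent. The following is a minimal sketch, assuming language codes are three lowercase letters as in the 'eng,deu' example; the helper names are hypothetical and not part of the node. Note that the PDF check only confirms the file starts with the standard `%PDF-` header; it cannot tell whether the pages are scanned images.

```python
import re
from typing import List

# Assumed format: three lowercase letters, as in 'eng' or 'deu'.
LANGUAGE_CODE = re.compile(r"[a-z]{3}")


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data begins with the PDF file header."""
    return data.startswith(b"%PDF-")


def invalid_language_codes(value: str) -> List[str]:
    """Return the entries of a comma-separated list that do not fit the assumed format."""
    codes = [code.strip() for code in value.split(",")]
    return [code for code in codes if not LANGUAGE_CODE.fullmatch(code)]


print(looks_like_pdf(b"%PDF-1.7\n..."))
print(invalid_language_codes("eng, english, DEU"))
```

An empty list from `invalid_language_codes` means every entry matches the assumed format; it does not guarantee that CloudConvert supports each language.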
Links
- CloudConvert API Documentation - Official API documentation for CloudConvert, including OCR operations.