
h2oGPTe

h2oGPTe is an AI-powered search assistant for your internal teams to answer questions gleaned from large volumes of documents, websites and workplace content.


Overview

This node operation fetches the OCR (Optical Character Recognition) model information for the pages of a document, identified by its Document ID. Use it to find out which OCR engine or configuration was used to extract text from the images in that document.

Typical use cases include:

  • Verifying or auditing the OCR model used on scanned documents.
  • Integrating with workflows that require knowledge of the OCR processing method.
  • Debugging or enhancing document processing pipelines by understanding OCR configurations.

For example, if you have a scanned PDF document ingested into your system and want to know which OCR model was applied to its pages, this operation will return that information.

Properties

Name          Meaning
Document ID   The unique identifier of the document whose page OCR model you want to fetch.

Output

The output JSON contains the details of the OCR model associated with the specified document's pages. This typically includes metadata about the OCR engine, settings, or model version used for text extraction on the document pages.

This operation gives no indication of binary data output; the result is purely JSON metadata about the OCR model.

Dependencies

  • Requires an API key credential for authentication to the external service hosting the document and OCR models.
  • The node makes an HTTP GET request to an endpoint structured as /documents/{document_id}/page_ocr_model.
  • The base URL and authentication headers must be configured properly in the node credentials.
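To make the request shape concrete, here is a minimal Python sketch that builds (but does not send) the GET request described above. Only the endpoint path comes from this page. The function name, the example base URL, the document ID, and the bearer-token Authorization header are assumptions for illustration; check your deployment's API documentation for the actual base URL and authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request


def build_page_ocr_model_request(base_url: str, api_key: str, document_id: str) -> Request:
    """Build (but do not send) the GET request for a document's page OCR model."""
    # Percent-encode the ID so it is safe to place in the URL path.
    path = f"/documents/{quote(document_id, safe='')}/page_ocr_model"
    url = base_url.rstrip("/") + path
    headers = {
        # Assumption: bearer-token scheme; your deployment may differ.
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")


# Hypothetical values, for illustration only.
request = build_page_ocr_model_request(
    "https://h2ogpte.example.com/api/v1", "my-api-key", "doc-123"
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the URL and headers easy to inspect when debugging credentials.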

Troubleshooting

  • Invalid Document ID: If the provided Document ID does not exist or is incorrect, the API will likely return a 404 error. Verify the Document ID is correct.
  • Authentication Errors: Missing or invalid API key credentials will cause authorization failures. Ensure valid credentials are set up.
  • Network Issues: Connectivity problems to the API endpoint can cause timeouts or connection errors. Check network access and endpoint availability.
  • Empty or Unexpected Response: If the document has no OCR model applied or the data is missing, the response may be empty or incomplete. Confirm the document has been processed with OCR.
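The checks above can be sketched as a small triage helper, for example in a script that post-processes the node's result. The function name and messages are hypothetical, and the status codes are assumptions based on common HTTP conventions (this page only says a 404 is likely for an unknown Document ID); the actual API may respond differently.

```python
import json
from typing import Optional


def diagnose_response(status: Optional[int], body: str) -> str:
    """Map a response (or the lack of one) to a troubleshooting hint."""
    if status is None:
        # No HTTP status at all: the request never completed.
        return "network: no response; check connectivity and endpoint availability"
    if status in (401, 403):
        # Assumption: conventional codes for missing or invalid credentials.
        return "auth: check the API key credential"
    if status == 404:
        return "not found: verify the Document ID"
    if status >= 400:
        return f"error: unexpected HTTP status {status}"
    try:
        # Treat a blank body as "no data" rather than a parse error.
        payload = json.loads(body) if body.strip() else None
    except json.JSONDecodeError:
        return "unexpected: response body is not valid JSON"
    if not payload:
        return "empty: confirm the document was processed with OCR"
    return "ok"


print(diagnose_response(404, ""))
print(diagnose_response(200, "{}"))
```

The helper makes no assumption about the fields inside the OCR model JSON; it only distinguishes an empty payload from a non-empty one.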

Links and References

  • Refer to the API documentation of the service hosting the document and OCR models for the detailed schema of the OCR model response.
  • See the n8n documentation on HTTP Request nodes and credential setup for API integrations.
