Aparavi DTC

Complete Aparavi DTC platform with OCR, parsing, transcription, PII censoring, and custom pipelines

Overview

This node integrates with the Aparavi DTC platform to perform document and audio processing tasks: OCR, parsing, audio transcription, audio summarization, PII anonymization, and custom pipeline execution. It accepts input as a file path, as binary data, or as text data. Typical uses include extracting text from scanned documents, transcribing audio files, anonymizing sensitive personal information, and running custom data processing pipelines.

Use Case Examples

  1. Extract text from a PDF file using OCR.
  2. Transcribe audio recordings to text.
  3. Anonymize personally identifiable information (PII) in text data according to US, international, or HIPAA standards.
  4. Run a custom Aparavi pipeline defined in JSON format for specialized data workflows.
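
Because a custom pipeline is supplied as JSON, it can help to validate the text before handing it to the node. The following is a minimal sketch, not the node's actual code: `parsePipelineConfig` is a hypothetical helper, and the requirement that the top-level value be an object is an assumption, since the pipeline schema is not described on this page.

```typescript
// Hypothetical helper: parse a custom pipeline definition and fail early
// with a readable message. The "must be an object" rule is an assumption.
function parsePipelineConfig(raw: string): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Invalid pipeline JSON: ${reason}`);
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Pipeline configuration must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}
```

Calling the helper with malformed text such as `{oops` throws an error whose message starts with `Invalid pipeline JSON:`, which is easier to act on than a failure later in the workflow.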

Properties

Name             Meaning
Input Type       Specifies the type of input data to process: a file path, binary data from a previous node, or raw text data.
File Path        Full path to the file to process. Used when Input Type is 'File Path'.
Binary Property  Name of the binary property containing the file to process. Used when Input Type is 'Binary Data'.
Text Data        Text data to process. Used when Input Type is 'Text Data'.
Options          Additional options for the operation, such as a custom API base URL, timeout duration, and retry attempts on failure.
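
The relationship between Input Type and the three input properties can be sketched as a small resolver that reads only the field matching the selected type. This is an illustration under assumptions: the parameter keys (`inputType`, `filePath`, `binaryProperty`, `textData`) and the function `resolveInput` are hypothetical names, not the node's documented internals.

```typescript
// Hypothetical parameter shape mirroring the properties table.
type InputType = "filePath" | "binaryData" | "textData";

interface NodeParams {
  inputType: InputType;
  filePath?: string;
  binaryProperty?: string;
  textData?: string;
}

type ResolvedInput =
  | { kind: "file"; path: string }
  | { kind: "binary"; property: string }
  | { kind: "text"; text: string };

// Pick the one field that applies to the selected Input Type and
// reject the call if that field is missing or empty.
function resolveInput(params: NodeParams): ResolvedInput {
  switch (params.inputType) {
    case "filePath":
      if (!params.filePath) {
        throw new Error("File Path is required when Input Type is 'File Path'");
      }
      return { kind: "file", path: params.filePath };
    case "binaryData":
      if (!params.binaryProperty) {
        throw new Error("Binary Property is required when Input Type is 'Binary Data'");
      }
      return { kind: "binary", property: params.binaryProperty };
    case "textData":
      if (!params.textData) {
        throw new Error("Text Data is required when Input Type is 'Text Data'");
      }
      return { kind: "text", text: params.textData };
    default:
      throw new Error("Unsupported input type");
  }
}
```

Fields that do not belong to the selected Input Type are ignored, which matches the "Used when" wording in the table.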

Output

JSON

  • json
    • error - Error message if the operation fails and continueOnFail is enabled.
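
The `error` field follows the usual continue-on-fail pattern: when the setting is enabled, a failed item produces an output item carrying the error message instead of aborting the run. The sketch below shows that pattern in isolation. `runWithErrorCapture` and `OutputItem` are hypothetical names, and the processing callback is synchronous only to keep the example short; real processing would normally be asynchronous.

```typescript
interface OutputItem {
  json: Record<string, unknown>;
}

// Hypothetical illustration of continue-on-fail handling:
// failures become { json: { error } } items when the flag is set,
// otherwise the first failure is rethrown and stops the run.
function runWithErrorCapture<T>(
  items: T[],
  processItem: (item: T) => Record<string, unknown>,
  continueOnFail: boolean,
): OutputItem[] {
  const out: OutputItem[] = [];
  for (const item of items) {
    try {
      out.push({ json: processItem(item) });
    } catch (err) {
      if (!continueOnFail) throw err;
      const message = err instanceof Error ? err.message : String(err);
      out.push({ json: { error: message } });
    }
  }
  return out;
}
```

With the flag disabled, downstream nodes never see an `error` field because the workflow stops at the failing item.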

Dependencies

  • Requires an API key credential for the Aparavi platform.
  • Uses the 'aparavi-client' library to communicate with the Aparavi API.
  • Uses Node.js built-in modules such as 'fs', 'path', and 'os' for file handling.

Troubleshooting

  • Ensure that the Aparavi API credentials are correctly configured; a missing API key causes the node to throw an error.
  • For file or binary input types, verify that the file path or binary property name is correct and accessible.
  • Text input is supported only for certain operations, such as Anonymize PII; using text input with an unsupported operation causes an error.
  • Invalid JSON format in custom pipeline configuration will cause the node to throw an error.
  • Connection errors to the Aparavi WebSocket API are retried automatically, but persistent connection issues may require checking network settings or API availability.
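
To make the automatic retry behaviour concrete, here is a generic retry loop of the kind the last item describes. It is a sketch under assumptions, not the node's implementation: `withRetry` and `backoffDelayMs` are hypothetical names, and the exponential backoff schedule (500 ms base, 8 s cap) is invented for illustration, since the node's real timing is not documented here.

```typescript
// Assumed schedule: exponential backoff with a cap.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run `task`, retrying up to `retries` additional times on failure.
// `sleep` is injectable so the loop can be exercised without real delays.
async function withRetry<T>(
  task: () => Promise<T>,
  retries: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < retries) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Once the retry budget is exhausted the last error is rethrown, which corresponds to the "persistent connection issues" case where network settings or API availability need to be checked.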

Links

Discussion