Aparavi DTC

Complete Aparavi DTC platform with OCR, parsing, transcription, PII censoring, and custom pipelines

Overview

This node integrates with the Aparavi DTC platform to run data processing operations: OCR, document parsing, advanced parsing, audio transcription, audio summarization, PII anonymization, and PII censoring for USA, international, or HIPAA healthcare data. It accepts input as a file path, binary data, or text data, and it can also run custom pipeline workflows. Typical uses include censoring personally identifiable information (PII) in documents or audio files, extracting text from images, and summarizing audio content.

Use Case Examples

  1. Censor USA-specific PII in customer documents by providing file paths or binary data.
  2. Transcribe audio files to text for further analysis.
  3. Extract structured data from scanned documents using OCR and parsing.
  4. Run a custom Aparavi pipeline for specialized data processing workflows.

Properties

Name               Meaning
Input Type         Specifies the type of input data to process: file path, binary data, or text data.
File Path          Full path to the file to process (shown when Input Type is 'File Path').
Binary Property    Name of the binary property containing the file to process (shown when Input Type is 'Binary Data').
Text Data          Text data to process (shown when Input Type is 'Text Data').
Input Data Mode    Determines whether to process all fields or only specific fields in the input data.
Fields to Process  Comma-separated list of field names to process when Input Data Mode is 'Specific Fields'.
Censor Character   Character used to censor detected PII in the output.
Options            Additional options including custom API base URL, timeout in seconds, and retry attempts on failure.
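
The interaction between Input Data Mode and Fields to Process can be illustrated with a minimal sketch. This is not the node's actual implementation: the function name select_fields and the mode values "all" and "specific" are hypothetical, chosen only to show how a comma-separated field list might be turned into a subset of an input item.

```python
from typing import Any, Dict


def select_fields(item: Dict[str, Any], mode: str, fields_csv: str = "") -> Dict[str, Any]:
    """Return the part of `item` that would be sent for processing.

    `mode` stands in for the Input Data Mode property. The values "all" and
    "specific" are assumptions made for this sketch, not documented values.
    """
    if mode == "all":
        return dict(item)
    if mode != "specific":
        raise ValueError(f"unknown input data mode: {mode!r}")

    # Split the comma-separated list, trimming whitespace and dropping empty entries.
    wanted = [name.strip() for name in fields_csv.split(",") if name.strip()]
    if not wanted:
        raise ValueError("Fields to Process is empty")

    # Report requested fields that are absent instead of skipping them silently.
    missing = [name for name in wanted if name not in item]
    if missing:
        raise KeyError(f"fields not found in input: {missing}")

    return {name: item[name] for name in wanted}
```

Raising on a missing field is a design choice of this sketch; whether the real node fails or skips unknown field names is not stated in this documentation.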

Output

JSON

  • json - Contains the processed results from the Aparavi platform, including censored data, parsed text, transcriptions, or pipeline execution status.

Dependencies

  • Requires an Aparavi API key credential for authentication to the Aparavi DTC platform.

Troubleshooting

  • Ensure valid Aparavi API credentials are provided; missing or invalid keys will cause authentication errors.
  • File input operations require the file to exist at the specified path; invalid paths will cause failures.
  • Binary input requires the binary property to exist and contain valid data; otherwise, errors will occur.
  • Text input is only supported for certain operations like PII anonymization; using text input for unsupported operations will throw errors.
  • Invalid JSON format for custom pipeline configuration will cause parsing errors; ensure JSON is well-formed.
  • Connection errors to the Aparavi service may occur. The node retries with exponential backoff, but persistent network issues will still cause failures.
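
Two of the points above, well-formed pipeline JSON and retries with exponential backoff, can be sketched as follows. This is an illustration under stated assumptions, not the node's code: parse_pipeline_config and call_with_backoff are hypothetical names, the requirement that the configuration be a JSON object is assumed, and the base delay and doubling factor are illustrative because the node's real retry schedule is not documented here.

```python
import json
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def parse_pipeline_config(raw: str) -> dict:
    """Parse a custom pipeline configuration and fail early with a clear message."""
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"pipeline configuration is not valid JSON: {exc}") from exc
    # Assumption for this sketch: the top-level value must be a JSON object.
    if not isinstance(config, dict):
        raise ValueError("pipeline configuration must be a JSON object")
    return config


def call_with_backoff(
    operation: Callable[[], T],
    retry_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError with doubling delays.

    `retry_attempts` counts retries after the first call, so the operation
    runs at most retry_attempts + 1 times. `sleep` is injectable for testing.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            if attempt >= retry_attempts:
                # Retries exhausted: surface the last connection error.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the defaults, a persistently failing call waits 1, 2, and 4 seconds before the final error is raised, which matches the note that persistent network issues still end in failure.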
