Aparavi PII Censor

Censor PII using Aparavi DTC with flexible input handling

Overview

The Aparavi PII Censor node detects and censors Personally Identifiable Information (PII) in input data using the Aparavi Data Transformation Cloud (DTC) service. It supports multiple PII categories, including USA-specific PII (e.g., Social Security Numbers, driver's license numbers), international PII (e.g., passport and phone numbers), and healthcare data regulated under HIPAA.

This node is useful in scenarios where sensitive information must be redacted or anonymized before further processing, sharing, or storage. For example, it can be used to automatically mask PII in customer records, medical documents, or international user data to comply with privacy regulations.

The node offers flexible input handling, allowing users to process all fields, specific fields, or let the node auto-detect the input type. It also provides options to preserve the original data structure and include metadata about the PII detection process.

Properties

PII Type
Selects the category of PII to detect and censor:
- USA PII: Detects USA-specific identifiers such as SSNs and driver's licenses.
- International PII: Detects international identifiers such as passports and phone numbers.
- Healthcare Data (HIPAA): Detects healthcare-related PII regulated under HIPAA.

Input Data
Defines how the input data is processed:
- Auto-detect: Automatically detects the input type.
- All Fields: Processes all fields in objects/arrays.
- Specific Fields: Processes only the specified fields.

Fields to Process
Comma-separated list of field names to process when "Specific Fields" is selected. If left empty, all fields are processed.

Censor Character
The character used to replace detected PII. Default is a solid block character (█).

Options
A collection of additional options:
- Preserve Structure (boolean): Keep the original data structure intact and place the censored data in a separate property.
- Include Metadata (boolean): Add metadata about the PII detection (type, timestamp, original keys) to the output.
- Batch Size (number): Number of items to process per batch when the input is an array (default 10).
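
In "Specific Fields" mode, the comma-separated list is the only routing input, so whitespace and trailing commas matter in practice. A minimal sketch of how such a list might be parsed and applied (the function names here are illustrative assumptions, not the node's actual internals):

```javascript
// Parse a comma-separated "Fields to Process" string into field names.
// An empty string means "process every field", per the property description.
function parseFieldList(fieldsToProcess) {
  return (fieldsToProcess || '')
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Decide which keys of an input item the censor should touch.
function selectFields(item, fieldNames) {
  if (fieldNames.length === 0) return Object.keys(item);
  return fieldNames.filter((name) => name in item);
}
```

For example, `parseFieldList(' ssn, phone ,')` yields `['ssn', 'phone']`, and any listed name that is absent from the item is simply skipped.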

Output

The node outputs an array of items where each item contains a json object representing the censored data.

  • If Preserve Structure is enabled, the censored PII data is placed inside a separate property (censored) while keeping the original data intact.
  • Otherwise, the original fields containing PII are replaced directly with their censored versions.
  • If Include Metadata is enabled, an additional _metadata field is added to each output item containing:
    • piiType: The selected PII type used for censoring.
    • processedAt: ISO timestamp of when the item was processed.
    • originalKeys: List of keys present in the original input item.

The node does not output binary data.

Example output JSON structure (simplified):

{
  "field1": "██████",
  "field2": "some value",
  "_metadata": {
    "piiType": "usa",
    "processedAt": "2024-06-01T12:00:00.000Z",
    "originalKeys": ["field1", "field2"]
  }
}
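
With Preserve Structure and Include Metadata both enabled, the censored values would instead land in a separate censored property while the original fields stay intact. The exact shape below is an illustrative assumption based on the descriptions above, not confirmed output:

{
  "field1": "original value",
  "field2": "some value",
  "censored": {
    "field1": "██████"
  },
  "_metadata": {
    "piiType": "usa",
    "processedAt": "2024-06-01T12:00:00.000Z",
    "originalKeys": ["field1", "field2"]
  }
}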

Dependencies

  • Requires an active API key credential for the Aparavi Data Transformation Cloud service.
  • The node depends on the external Aparavi PII censoring library which interacts with the Aparavi API.
  • Proper configuration of the API key credential within n8n is necessary for authentication.

Troubleshooting

  • Missing API Key Error: If the node throws an error indicating missing credentials, ensure that the required API key credential is configured correctly in n8n.
  • Unknown PII Type Error: Selecting an unsupported PII type will cause an error. Verify that the PII Type property is set to one of the supported options: USA, International, or HIPAA.
  • Field Processing Issues: When using "Specific Fields," if the specified field names do not exist in the input data, the node may produce empty or incomplete results. Double-check the comma-separated field list for typos and exact casing.
  • Large Data Sets: For large arrays, adjust the Batch Size option to optimize performance and avoid timeouts.
  • Continue On Fail Behavior: If enabled, the node will continue processing subsequent items even if some fail, returning error details in the output. Otherwise, it stops execution on the first failure.
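
The Batch Size and Continue On Fail behaviors above can be sketched roughly as follows. This is a simplified model, and processItem stands in for the actual Aparavi API call; it is an assumption, not the node's real code:

```javascript
// Process an array of items in fixed-size batches. When continueOnFail is
// true, a failed item is recorded as an error object and processing continues;
// otherwise the first failure aborts the whole run.
async function processInBatches(items, batchSize, continueOnFail, processItem) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    for (const item of batch) {
      try {
        results.push(await processItem(item));
      } catch (err) {
        if (!continueOnFail) throw err;       // stop on first failure
        results.push({ error: err.message }); // record the error and keep going
      }
    }
  }
  return results;
}
```

Lowering batchSize trades throughput for smaller, more timeout-resistant requests.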
