AIConnect

Use OpenAI-compatible API functions

Overview

The AIConnect node integrates with OpenAI-compatible APIs and supports several resources, including Audio. With the Audio resource and the Create Transcription operation, the node transcribes an audio file into text using the selected AI model.

This is useful for scenarios such as:

  • Converting recorded meetings, interviews, or podcasts into searchable text.
  • Generating subtitles or captions for videos.
  • Automating transcription workflows in content production or customer support.

For example, you can upload an MP3 recording of a conference call and receive a text transcript that can be further processed or stored.

Properties

| Name | Meaning |
|---|---|
| Model | The AI model to use for transcription (loaded dynamically). |
| File | The audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm. |
| Language | Optional ISO-639-1 code specifying the language spoken in the audio. |
| Prompt | Optional guiding text to influence the transcription style or continue from a previous segment. |
| Response Format | Format of the transcription output. Options: JSON, Text, SRT (subtitle), Verbose JSON, VTT (subtitle). |
| Temperature | Sampling temperature controlling randomness in transcription (0 to 1). |
| Timestamp Granularities | When using the Verbose JSON response format, choose the timestamp detail level: Word or Segment. |
| Additional Options | Collection of extra options; currently supports a "User" identifier string representing the end-user. |
| Simplify Output | Whether to return a simplified version of the transcription response instead of raw data (default: true). |
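
As a rough illustration of how the properties above could translate into a request, the sketch below builds the non-file form fields for an OpenAI-style transcription call. The field names (`model`, `language`, `prompt`, `response_format`, `temperature`, `timestamp_granularities[]`, `user`) follow the OpenAI audio transcription API; whether the node sends exactly these fields is an assumption, and `build_transcription_fields` and the model name `whisper-1` are placeholders for illustration only.

```python
from typing import List, Optional, Tuple

# Response Format options in their OpenAI-style wire form.
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_transcription_fields(
    model: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: str = "json",
    temperature: Optional[float] = None,
    timestamp_granularities: Optional[List[str]] = None,
    user: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return multipart form fields (without the file part) as (name, value) pairs."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response format: {response_format}")

    fields = [("model", model), ("response_format", response_format)]
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    if temperature is not None:
        fields.append(("temperature", str(temperature)))
    # Timestamp granularities only apply to verbose JSON, so skip them otherwise.
    if response_format == "verbose_json":
        for granularity in timestamp_granularities or []:
            fields.append(("timestamp_granularities[]", granularity))
    if user:
        fields.append(("user", user))
    return fields


fields = build_transcription_fields(
    model="whisper-1",  # placeholder; the node loads the real model list dynamically
    language="en",
    response_format="verbose_json",
    temperature=0.2,
    timestamp_granularities=["word", "segment"],
)
for name, value in fields:
    print(f"{name}={value}")
```

The audio file itself would be attached as a separate multipart part named `file`; it is left out here so the sketch runs without any input data.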

Output

The node outputs an array of items where each item contains a json field with the transcription result.

  • If Simplify Output is enabled, the output will be a streamlined transcription text or structured object depending on the response format.
  • If disabled, the output includes the full raw response from the transcription API, which may contain detailed metadata, timestamps, confidence scores, etc.
  • When subtitle formats like SRT or VTT are selected, the output contains the transcription formatted accordingly.
  • No binary data output is documented for this operation.
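
The node's actual simplification logic is not documented here, so the following is only a sketch of what a simplified item might look like compared with the raw response. The function `simplify_transcription` is hypothetical, and the sample response is illustrative data modelled on the OpenAI verbose JSON shape, not real node output.

```python
from typing import Any, Dict, Union


def simplify_transcription(raw: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical simplification: keep the text and, if present, segment timings."""
    # Text, SRT and VTT responses arrive as plain strings; wrap them unchanged.
    if isinstance(raw, str):
        return {"text": raw}

    simplified: Dict[str, Any] = {"text": raw.get("text", "")}
    # Verbose JSON may carry segments; keep only start, end and text for each.
    if "segments" in raw:
        simplified["segments"] = [
            {"start": s["start"], "end": s["end"], "text": s["text"]}
            for s in raw["segments"]
        ]
    return simplified


# Illustrative raw response (made-up values).
raw_response = {
    "task": "transcribe",
    "language": "english",
    "duration": 3.2,
    "text": "Hello everyone.",
    "segments": [
        {"id": 0, "start": 0.0, "end": 3.2, "text": "Hello everyone.", "avg_logprob": -0.21}
    ],
}

# Each output item carries its result under a "json" field.
item = {"json": simplify_transcription(raw_response)}
print(item)
```

With Simplify Output disabled, the item's `json` field would instead hold the full `raw_response` object.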

Dependencies

  • Requires an active connection to an OpenAI-compatible API endpoint supporting audio transcription.
  • Requires an API authentication token configured in n8n credentials (referred to generically as an API key credential).
  • The node dynamically loads available audio models via an internal method.
  • The input audio file must be in one of the supported formats.

Troubleshooting

  • Common issues:

    • An unsupported audio format or a corrupted file may cause the transcription to fail.
    • Missing or invalid API authentication token will prevent successful API calls.
    • Specifying an unsupported language code might lead to inaccurate or failed transcription.
    • Combining a response format with options it does not support (e.g., timestamp granularities without Verbose JSON) may cause errors.
  • Error messages:

    • 'The resource "audio" is not supported!': indicates misconfiguration of the resource parameter.
    • API errors related to authentication or quota limits should be resolved by verifying credentials and usage limits.
    • Errors about missing required parameters (like model or file) require ensuring all mandatory inputs are set.
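
Several of the issues above can be caught before the request is sent. The sketch below is one possible pre-flight check, for example in a step that runs ahead of the node. The helper `check_transcription_input` and its messages are hypothetical and not part of the node; the format list and the 0 to 1 temperature range come from the Properties section.

```python
import os
from typing import List, Optional

# Formats listed for the File property.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def check_transcription_input(
    filename: str,
    model: str,
    response_format: str = "json",
    temperature: float = 0.0,
    language: Optional[str] = None,
    timestamp_granularities: Optional[List[str]] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not model:
        problems.append("model is required")

    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")

    if not 0 <= temperature <= 1:
        problems.append("temperature must be between 0 and 1")

    # ISO-639-1 codes are two letters; this checks the shape only, not the code list.
    if language is not None and not (len(language) == 2 and language.isalpha()):
        problems.append("language should be a two-letter ISO-639-1 code")

    if timestamp_granularities and response_format != "verbose_json":
        problems.append("timestamp granularities require the verbose_json response format")
    return problems


print(check_transcription_input("call.mp3", "whisper-1"))
print(check_transcription_input("notes.txt", "", "srt", 1.5, "eng", ["word"]))
```

The check only inspects the file name, so a corrupted file with a valid extension would still pass and fail later at the API.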
