VocantAI Speech-to-Text

Upload audio and download the transcription in one node

Overview

This node integrates with the VocantAI Speech-to-Text service to transcribe audio files. It uploads an audio file, waits for the transcription job to complete by polling the service status, and then downloads the resulting transcription text. This all-in-one node simplifies speech-to-text workflows by handling upload, status checking, and download in a single step.

Common scenarios where this node is beneficial include:

  • Transcribing recorded interviews or meetings for documentation.
  • Converting podcasts or lectures into searchable text.
  • Automating subtitle generation for videos.

Practical example: pass an audio recording in a binary field, configure how often the node checks the transcription status, and receive the transcription as a text file attached to the output item.
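
The upload, poll, and download sequence described above can be sketched roughly as follows. This is a minimal illustration, not the node's actual source: the `TranscriptionApi` interface, its method names, the `"failed"` status value, and the injected `sleep` function are all assumptions made for the example, and the real VocantAI endpoints are not shown. For simplicity the sketch returns a failure result for both failed and timed-out jobs.

```typescript
// Hypothetical client interface; the real VocantAI requests are not reproduced here.
interface TranscriptionApi {
  upload(audio: Uint8Array, fileName: string): Promise<string>; // resolves to a job ID
  getStatus(jobId: string): Promise<string>; // e.g. "pending", "completed"; "failed" is assumed
  download(jobId: string): Promise<string>; // resolves to the transcription text
}

interface TranscribeResult {
  success: boolean;
  jobId: string;
  status: string;
  waited: number; // seconds spent waiting between status checks
  text?: string;
  error?: string;
}

type Sleep = (seconds: number) => Promise<void>;

const realSleep: Sleep = (seconds) =>
  new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));

async function transcribe(
  api: TranscriptionApi,
  audio: Uint8Array,
  fileName: string,
  pollingInterval: number,
  maxWaitTime: number,
  sleep: Sleep = realSleep, // injectable so the loop can be exercised without real delays
): Promise<TranscribeResult> {
  // Step 1: upload the audio and obtain a job ID.
  const jobId = await api.upload(audio, fileName);

  // Step 2: poll until the job finishes or another wait would exceed maxWaitTime.
  let waited = 0;
  let status = await api.getStatus(jobId);
  while (
    status !== "completed" &&
    status !== "failed" &&
    waited + pollingInterval <= maxWaitTime
  ) {
    await sleep(pollingInterval);
    waited += pollingInterval;
    status = await api.getStatus(jobId);
  }

  // Step 3: download the text on success, otherwise report why the job did not finish.
  if (status === "completed") {
    const text = await api.download(jobId);
    return { success: true, jobId, status, waited, text };
  }
  const error =
    status === "failed" ? "Transcription failed" : "Timeout waiting for transcription";
  return { success: false, jobId, status, waited, error };
}
```

Injecting `api` and `sleep` keeps the control flow separate from the HTTP details, so the same loop can be driven by a stub client in tests.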

Properties

| Name | Meaning |
| --- | --- |
| Audio File (binary) | The name of the input binary field containing the audio file to process. |
| Use Original Filename | Whether to use the original filename of the audio file when naming the transcription file. |
| Polling Interval (seconds) | How often (in seconds) the node checks the transcription job status. |
| Max Wait Time (seconds) | Maximum time (in seconds) to wait for the transcription job to complete before timing out. |
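
To see how the two timing properties interact, the sketch below validates a pair of values and computes how many status checks fit into the wait budget. The `planPolling` helper and the validation rules are assumptions made for illustration; the node's own validation and default values are not documented here.

```typescript
interface PollingPlan {
  pollingInterval: number; // seconds between status checks
  maxWaitTime: number; // seconds before the job is treated as timed out
  maxChecks: number; // number of whole polling intervals that fit into maxWaitTime
}

// Hypothetical helper: rejects values that could never produce a status check.
function planPolling(pollingInterval: number, maxWaitTime: number): PollingPlan {
  if (!Number.isFinite(pollingInterval) || pollingInterval <= 0) {
    throw new RangeError("Polling interval must be a positive number of seconds");
  }
  if (!Number.isFinite(maxWaitTime) || maxWaitTime < pollingInterval) {
    throw new RangeError("Max wait time must be at least one polling interval");
  }
  return {
    pollingInterval,
    maxWaitTime,
    maxChecks: Math.floor(maxWaitTime / pollingInterval),
  };
}
```

For example, a polling interval of 5 seconds with a max wait time of 300 seconds allows 60 checks. A shorter interval notices completion sooner at the cost of more status requests.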

Output

The node outputs one item for each input item processed. Each output item contains:

  • json:

    • success: Boolean indicating if transcription succeeded.
    • jobId: The unique identifier of the transcription job.
    • status: Current or final status of the transcription job (e.g., "pending", "completed").
    • waited: Total time waited (in seconds) for the transcription to complete.
    • error (if any): Error message if transcription failed or timed out.
  • binary (only on success):

    • data: The transcription text file encoded in base64.
    • mimeType: Always "text/plain".
    • fileName: The transcription file name, either based on the original audio filename or a generated name including the job ID.

The binary data represents the full transcription text file downloaded from the service.
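
The output shape listed above can be written down as types, together with two small helpers a downstream step might use. This is a sketch under stated assumptions: the interfaces mirror the fields as listed (in a running workflow the binary entry may be nested under a property name), the `transcription_<jobId>.txt` pattern is a guess at what a generated name could look like, and `Buffer` assumes a Node.js runtime.

```typescript
interface TranscriptionJson {
  success: boolean;
  jobId: string;
  status: string;
  waited: number; // seconds
  error?: string;
}

interface TranscriptionBinary {
  data: string; // base64-encoded transcription text
  mimeType: string; // "text/plain"
  fileName: string;
}

interface TranscriptionItem {
  json: TranscriptionJson;
  binary?: TranscriptionBinary; // present only on success
}

// Assumed naming scheme: reuse the audio file's base name, or fall back to the job ID.
function transcriptionFileName(jobId: string, originalName?: string): string {
  if (originalName) {
    const base = originalName.replace(/\.[^./\\]+$/, ""); // drop the last extension only
    return `${base}.txt`;
  }
  return `transcription_${jobId}.txt`;
}

// Decode the base64 payload back into text; returns undefined for failed items.
function readTranscription(item: TranscriptionItem): string | undefined {
  if (!item.json.success || !item.binary) {
    return undefined;
  }
  return Buffer.from(item.binary.data, "base64").toString("utf8");
}
```

Checking `json.success` before touching `binary` matters because failed or timed-out items carry no binary data.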

Dependencies

  • Requires an API key credential for the VocantAI service.
  • Makes HTTP requests to VocantAI endpoints for presigned URL retrieval, file upload, job status polling, and transcription download.
  • Uses n8n helper methods for HTTP requests and binary data handling.

Troubleshooting

  • No binary data found error: Occurs if the specified binary property does not exist on the input item. Ensure the correct binary field name is provided and that the input contains valid binary audio data.

  • Transcription failed error: If the transcription job fails on the server side, the node throws an error with the failure message returned by VocantAI. Check the audio file format and content, and verify API key validity.

  • Timeout waiting for transcription: If the transcription does not complete within the configured max wait time, the node returns a failure result for that item. Increase the max wait time or check the VocantAI service status.

  • HTTP request errors: Network issues or invalid API keys will cause HTTP request failures. Verify network connectivity and API credentials.
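
A downstream step can separate successful items from failed ones and map each failure to one of the remedies above. The sketch below is illustrative only: `partitionResults` and `suggestFix` are hypothetical helpers, and matching on fragments of the error text assumes the messages resemble the error names listed in this section.

```typescript
interface ResultJson {
  success: boolean;
  jobId?: string;
  status?: string;
  waited?: number;
  error?: string;
}

interface ResultItem {
  json: ResultJson;
}

// Split the node's output into items that succeeded and items that did not.
function partitionResults(items: ResultItem[]): {
  succeeded: ResultItem[];
  failed: ResultItem[];
} {
  const succeeded: ResultItem[] = [];
  const failed: ResultItem[] = [];
  for (const item of items) {
    (item.json.success ? succeeded : failed).push(item);
  }
  return { succeeded, failed };
}

// Map an error message to a remedy; the substring matching is an assumption.
function suggestFix(error: string): string {
  const message = error.toLowerCase();
  if (message.includes("no binary data")) {
    return "Check the binary field name and confirm the input carries audio data.";
  }
  if (message.includes("timeout")) {
    return "Increase Max Wait Time or check the VocantAI service status.";
  }
  if (message.includes("transcription failed")) {
    return "Check the audio file format and content, and verify the API key.";
  }
  return "Verify network connectivity and API credentials.";
}
```

Routing failed items to a separate branch keeps one bad recording from blocking the rest of a batch.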
