ElevenLabs

Interact with the ElevenLabs API

Overview

This node uses the ElevenLabs API to run speech-to-text transcription on audio or video files. It converts the spoken content of a media file into written text and, when diarization is enabled, identifies the different speakers. Typical uses include generating subtitles, building searchable archives of audio/video content, and analyzing conversations.

Practical examples:

  • Transcribing recorded interviews or meetings to text for documentation.
  • Creating captions for videos automatically.
  • Analyzing multi-speaker podcasts by identifying who spoke when.

Properties

  • File: The audio or video file to transcribe. This should be provided as a string reference to the data.
  • Additional Options: A collection of optional parameters to customize the transcription:

    • Model: Select the speech-to-text model to use. Can be chosen from a list of available models or specified by ID.
    • Language Code: ISO 639-1 language code to enforce the language for transcription (default is "en" for English).
    • Number of Speakers: Maximum number of speakers expected in the audio file (minimum 1).
    • Diarize: Boolean flag to enable speaker diarization, which annotates which speaker is talking at each time.
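The options above can be sketched as a small mapping step from node settings to request form fields. This is an illustrative sketch only: the option keys (`model`, `languageCode`, `numSpeakers`, `diarize`) and the form field names (`model_id`, `language_code`, `num_speakers`, `diarize`) are assumptions, not taken from the node's source, and `example-model-id` is a placeholder rather than a real model.

```python
import re


def build_form_fields(options: dict) -> dict:
    """Map the node's Additional Options to string form fields.

    Option keys and field names are assumptions made for illustration.
    """
    fields = {}

    if "model" in options:
        fields["model_id"] = str(options["model"])

    # The documentation states that the language code defaults to "en".
    language = options.get("languageCode", "en")
    if not re.fullmatch(r"[a-z]{2}", language):
        raise ValueError(f"Expected a two-letter ISO 639-1 code, got {language!r}")
    fields["language_code"] = language

    if "numSpeakers" in options:
        num_speakers = int(options["numSpeakers"])
        # The documentation gives 1 as the minimum number of speakers.
        if num_speakers < 1:
            raise ValueError("Number of Speakers must be at least 1")
        fields["num_speakers"] = str(num_speakers)

    if "diarize" in options:
        # Form fields are strings, so the boolean is serialized explicitly.
        fields["diarize"] = "true" if options["diarize"] else "false"

    return fields


example_fields = build_form_fields(
    {"model": "example-model-id", "numSpeakers": 2, "diarize": True}
)
```

Validating the language code and speaker count before sending anything keeps obvious configuration mistakes from reaching the API at all.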

Output

The node outputs JSON data containing the transcription results. The structure typically includes the transcribed text and, if diarization is enabled, annotations describing the speaker segments. Nothing indicates that the node emits binary data, so the output is assumed to be JSON only.
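As a rough illustration of working with diarized output, the sketch below merges consecutive words from the same speaker into segments. The output shape used here (a `words` list whose items carry `text`, `start`, `end`, and `speaker_id`) is an assumption for the example; check the node's actual output before relying on these field names.

```python
def group_by_speaker(words):
    """Merge consecutive words spoken by the same speaker into segments.

    The field names ("text", "start", "end", "speaker_id") are assumed.
    """
    segments = []
    for word in words:
        speaker = word.get("speaker_id")
        if segments and segments[-1]["speaker"] == speaker:
            # Same speaker as the previous word: extend the open segment.
            segments[-1]["text"] += " " + word["text"]
            segments[-1]["end"] = word["end"]
        else:
            # Speaker changed (or first word): start a new segment.
            segments.append(
                {
                    "speaker": speaker,
                    "start": word["start"],
                    "end": word["end"],
                    "text": word["text"],
                }
            )
    return segments


# Hypothetical sample shaped like the assumed node output.
sample_output = {
    "text": "Hello there. Hi.",
    "words": [
        {"text": "Hello", "start": 0.0, "end": 0.4, "speaker_id": "speaker_0"},
        {"text": "there.", "start": 0.5, "end": 0.9, "speaker_id": "speaker_0"},
        {"text": "Hi.", "start": 1.2, "end": 1.5, "speaker_id": "speaker_1"},
    ],
}

segments = group_by_speaker(sample_output["words"])
```

With the sample above, `segments` holds two entries, one per speaker turn, which is the shape needed for "who spoke when" analysis or for speaker-labeled captions.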

Dependencies

  • Requires an API key credential for ElevenLabs API authentication.
  • Network access to the https://api.elevenlabs.io/v1 endpoint.
  • No additional environment variables are indicated beyond the API key credential.
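To make these dependencies concrete, here is a minimal sketch of how an authenticated request to the base URL above might be prepared. The `/speech-to-text` path and the `xi-api-key` header name are assumptions not stated in this document, and the sketch only constructs the request object without sending it or attaching a file.

```python
import urllib.request

BASE_URL = "https://api.elevenlabs.io/v1"


def build_request(api_key: str, path: str = "/speech-to-text") -> urllib.request.Request:
    """Prepare (but do not send) an authenticated POST request.

    The default path and the "xi-api-key" header name are assumptions.
    """
    if not api_key:
        raise ValueError("An ElevenLabs API key is required")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"xi-api-key": api_key},
        method="POST",
    )


# "test-key" is a placeholder, not a real credential.
request = build_request("test-key")
```

In the node itself the key comes from the configured credential, so a check like the one in `build_request` corresponds to the "invalid or missing API key" case described under Troubleshooting.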

Troubleshooting

  • Common issues:

    • Invalid or missing API key will cause authentication errors.
    • Unsupported or corrupted audio/video file formats may lead to failed transcription.
    • Specifying an incorrect language code might reduce transcription accuracy.
    • Setting the number of speakers too low or too high can affect diarization quality.
  • Error messages:

    • Authentication failures: Check that the API key credential is correctly configured.
    • File upload or format errors: Verify the input file is accessible and in a supported format.
    • Model selection errors: Ensure the selected model exists and is available in the account.
