ElevenLabs

Interact with ElevenLabs API

Actions (7)

Voice Actions
Speech Actions

Overview

The node integrates with the ElevenLabs API to perform speech-to-text transcription on audio or video files. It converts spoken content within media files into written text, optionally identifying different speakers if diarization is enabled. This node is useful for automating transcription tasks such as generating subtitles, creating searchable archives of audio/video content, or analyzing conversations.

Practical examples:

- Transcribing recorded interviews or meetings to text for documentation.
- Creating captions for videos automatically.
- Analyzing multi-speaker podcasts by identifying who spoke when.

Properties

Name	Meaning
File	The audio or video file to transcribe. This should be provided as a string reference to the data.
Additional Options	A collection of optional parameters to customize the transcription:
- Model	Select the speech-to-text model to use. Can be chosen from a list of available models or specified by ID.
- Language Code	ISO 639-1 language code to enforce the language for transcription (default is "en" for English).
- Number of Speakers	Maximum number of speakers expected in the audio file (minimum 1).
- Diarize	Boolean flag to enable speaker diarization, which annotates which speaker is talking at each time.
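As a rough illustration of how the properties above might map onto a request, the sketch below turns the Additional Options into multipart form fields. The field names (`model_id`, `language_code`, `num_speakers`, `diarize`) and the default model ID `scribe_v1` are assumptions based on the public ElevenLabs speech-to-text API, not taken from this node's source, and `build_transcription_fields` is a hypothetical helper.

```python
from typing import Dict, Optional


def build_transcription_fields(
    model_id: str = "scribe_v1",          # assumed default model ID
    language_code: Optional[str] = "en",  # node default per the table above
    num_speakers: Optional[int] = None,
    diarize: bool = False,
) -> Dict[str, str]:
    """Turn the node's Additional Options into form fields (all values as strings)."""
    if num_speakers is not None and num_speakers < 1:
        raise ValueError("Number of Speakers must be at least 1")

    # Field names are assumptions; check them against the API reference.
    fields = {"model_id": model_id, "diarize": "true" if diarize else "false"}
    if language_code:
        fields["language_code"] = language_code
    if num_speakers is not None:
        fields["num_speakers"] = str(num_speakers)
    return fields


if __name__ == "__main__":
    print(build_transcription_fields(num_speakers=2, diarize=True))
```

Options left unset are omitted from the result rather than sent as empty strings, so the API can apply its own defaults.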

Output

The node outputs JSON data containing the transcription results. The structure typically includes the transcribed text and, if diarization is enabled, annotations about speaker segments. The node does not appear to produce binary data; the output is JSON only.
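To show how diarized output might be consumed downstream, here is a minimal sketch that merges consecutive words from the same speaker into turns. The response shape is an assumption: a `words` list whose entries carry `text`, `start`, `end`, and `speaker_id`, with spaces delivered as separate entries. `SAMPLE` and `speaker_turns` are hypothetical names, and the sample data is invented for illustration.

```python
from typing import Any, Dict, List

# Invented sample in the assumed response shape.
SAMPLE: Dict[str, Any] = {
    "text": "Hi there. Hello.",
    "words": [
        {"text": "Hi", "start": 0.0, "end": 0.2, "speaker_id": "speaker_0"},
        {"text": " ", "start": 0.2, "end": 0.3, "speaker_id": "speaker_0"},
        {"text": "there.", "start": 0.3, "end": 0.6, "speaker_id": "speaker_0"},
        {"text": " ", "start": 0.6, "end": 0.7, "speaker_id": "speaker_1"},
        {"text": "Hello.", "start": 0.7, "end": 1.0, "speaker_id": "speaker_1"},
    ],
}


def speaker_turns(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge consecutive words with the same speaker_id into one turn."""
    turns: List[Dict[str, Any]] = []
    for word in result.get("words", []):
        speaker = word.get("speaker_id")
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker continues: extend the current turn.
            turns[-1]["text"] += word["text"]
            turns[-1]["end"] = word.get("end")
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append({
                "speaker": speaker,
                "start": word.get("start"),
                "end": word.get("end"),
                "text": word["text"],
            })
    for turn in turns:
        turn["text"] = turn["text"].strip()
    return turns


if __name__ == "__main__":
    for turn in speaker_turns(SAMPLE):
        print(turn["speaker"], turn["start"], turn["end"], turn["text"])
```

If diarization is disabled and no speaker IDs are present, every word has the speaker `None` and the function returns a single turn.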

Dependencies

- Requires an API key credential for ElevenLabs API authentication.
- Requires network access to the https://api.elevenlabs.io/v1 endpoint.
- No environment variables are required beyond the API key credential.
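One way to check both dependencies outside the workflow is to build a small authenticated request against the base URL. The sketch below only constructs the request and sends nothing. It assumes the API key is passed in the `xi-api-key` header and that a `/models` path exists under the base URL; `build_models_request` is a hypothetical helper.

```python
import urllib.request

BASE_URL = "https://api.elevenlabs.io/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    if not api_key:
        raise ValueError("An ElevenLabs API key is required")
    return urllib.request.Request(
        f"{BASE_URL}/models",  # assumed path, used here only as a cheap probe
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_models_request("your-api-key")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would exercise both the network path and the credential. A connection error points at network access, while an HTTP 401 response suggests the key was rejected.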

Troubleshooting

Common issues:
- Invalid or missing API key will cause authentication errors.
- Unsupported or corrupted audio/video file formats may lead to failed transcription.
- Specifying an incorrect language code might reduce transcription accuracy.
- Setting the number of speakers too low or too high can affect diarization quality.
Error messages:
- Authentication failures: Check that the API key credential is correctly configured.
- File upload or format errors: Verify the input file is accessible and in a supported format.
- Model selection errors: Ensure the selected model exists and is available in the account.
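Several of the issues above can be caught before any request is made. The following sketch is a hypothetical pre-flight check, not part of the node: it assumes the input is available as a local file path (the node itself takes a data reference) and treats a language code as plausible if it is two lowercase letters, which is the shape of an ISO 639-1 code but does not prove the code exists.

```python
import os
import re
from typing import List, Optional


def preflight_problems(
    api_key: str,
    file_path: str,
    language_code: Optional[str] = "en",
    num_speakers: Optional[int] = None,
) -> List[str]:
    """Return human-readable problems found before calling the API."""
    problems: List[str] = []

    if not api_key or not api_key.strip():
        problems.append("API key is missing")

    if not os.path.isfile(file_path):
        problems.append(f"File not found: {file_path}")
    elif os.path.getsize(file_path) == 0:
        problems.append(f"File is empty: {file_path}")

    # Shape check only: two lowercase letters, as in ISO 639-1.
    if language_code is not None and not re.fullmatch(r"[a-z]{2}", language_code):
        problems.append(f"Language code does not look like ISO 639-1: {language_code!r}")

    if num_speakers is not None and num_speakers < 1:
        problems.append("Number of Speakers must be at least 1")

    return problems


if __name__ == "__main__":
    for problem in preflight_problems("", "missing.mp3", "english", 0):
        print("-", problem)
```

An empty list means none of these local checks failed. Format support and model availability can only be confirmed by the API itself.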

Links and References

ElevenLabs API Documentation (for detailed API capabilities and model options)
ISO 639-1 Language Codes (for valid language codes)