ElevenLabs

Generate natural-sounding speech using ElevenLabs AI

Actions (13)

Speech Actions
- Text to Speech
- Speech to Speech
Voice Actions
History Actions
User Actions
- Get User Info
- Get User Subscription

Overview

The ElevenLabs Speech to Speech operation converts input speech audio into new speech audio using AI voice cloning and synthesis. This node is useful for voice dubbing, creating personalized voice assistants, or generating speech in different styles or languages while preserving the original speaker's characteristics.

For example, you can supply an audio clip of a person speaking and generate a new clip in which the same words are spoken with a different style, clarity, or accent. The node supports tuning voice parameters such as similarity to the original voice, stability, style, and noise reduction.

Properties

Name	Meaning
Voice Name or ID	Select a voice from a list or specify a voice ID to use for speech synthesis.
Model Name or ID	Select a speech model from a list or specify a model ID. The "Turbo v2.5" model supports language codes; others do not.
Binary Name	Name of the binary property where the generated audio data will be stored (default: "data").
File Name	Name of the generated audio file (default: "voice").
Output Format	Audio format of the generated speech. Options include FLAC (16 kHz or 24 kHz), MP3 (44.1 kHz at 128 kbps or 64 kbps), MULAW (16 kHz), and WAVE (16 kHz or 24 kHz).
Similarity Boost	How closely the generated voice matches the original voice (range 0-1). 0 means more freedom, 1 means very similar.
Stability	Controls variation across re-generations of the voice (range 0-1). 0 means more variable, 1 means more stable.
Style	Amount of style applied to the voice (range 0-1). 0 is neutral, 1 is maximum style.
Speaker Boost	Boolean. When enabled, enhances voice clarity and reduces background noise.
Streaming Latency	Trades audio quality for lower latency. Options range from no optimization (best quality) to maximum optimization (lowest latency, potentially lower quality), including a mode that disables text normalization for the fastest response.
Text Normalization	Controls how text is normalized before generation. Options are Auto (system decides), On (always normalize), and Off (never normalize). Cannot be set to On for the Turbo v2.5 model.
Language Code	ISO 639-1 language code (e.g., "en", "de", "fr"). Only supported by the Turbo v2.5 model.
Next Text	Text that follows the current text, used to improve prosody when concatenating multiple generations.
Previous Text	Text that precedes the current text, also used to improve prosody.
Seed	Numeric seed (0 to 4294967295) to fix randomness for consistent voice output.
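The numeric ranges in the table above can be checked before a workflow runs. The following is a minimal sketch, not part of the node itself: the function name `check_voice_settings` and its parameter names are hypothetical, and only the ranges (0 to 1 for Similarity Boost, Stability, and Style; 0 to 4294967295 for Seed) come from the table.

```python
from typing import List, Optional

SEED_MAX = 4294967295  # upper bound for Seed, taken from the Properties table


def check_voice_settings(
    similarity_boost: float,
    stability: float,
    style: float,
    seed: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if all values are in range)."""
    problems = []
    ranged = (
        ("similarity_boost", similarity_boost),
        ("stability", stability),
        ("style", style),
    )
    for name, value in ranged:
        # Each of these properties is documented as a value in the range 0-1.
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0 and 1, got {value}")
    # Seed is optional; when present it must fit the documented range.
    if seed is not None and not 0 <= seed <= SEED_MAX:
        problems.append(f"seed must be between 0 and {SEED_MAX}, got {seed}")
    return problems


ok = check_voice_settings(0.8, 0.5, 0.0, seed=42)
bad = check_voice_settings(1.2, 0.5, -0.1, seed=SEED_MAX + 1)
```

Here `ok` is an empty list, while `bad` holds one message each for the out-of-range similarity boost, style, and seed.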

Output

The node outputs the generated speech audio as binary data attached to the specified binary property (default name: "data"). The binary data contains the audio file in the selected output format (e.g., MP3, FLAC, WAVE). The JSON output typically includes metadata about the generation request, such as the voice ID, model ID, and any relevant status messages.
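As an illustration of how a downstream step might read that binary property, here is a rough sketch. It assumes the item follows the usual n8n shape, with a `binary` object keyed by property name whose `data` field holds base64-encoded content (the in-memory binary mode); this page does not confirm that layout. The helper `extract_audio`, the example item, and the `voice.mp3` file name are all hypothetical.

```python
import base64
from typing import Any, Dict


def extract_audio(item: Dict[str, Any], binary_name: str = "data") -> bytes:
    """Hypothetical helper: decode the audio stored under the given binary property."""
    binary = item.get("binary", {})
    if binary_name not in binary:
        raise KeyError(f"no binary property named {binary_name!r} on this item")
    # Assumption: the payload is base64 text, as in n8n's in-memory binary mode.
    return base64.b64decode(binary[binary_name]["data"])


# Hypothetical item shaped like the node's output under the assumptions above.
example_item = {
    "json": {},
    "binary": {
        "data": {
            "data": base64.b64encode(b"fake-audio-bytes").decode("ascii"),
            "mimeType": "audio/mpeg",
            "fileName": "voice.mp3",
        }
    },
}

audio_bytes = extract_audio(example_item)
```

If the Binary Name property was changed from its default, pass that name as `binary_name`.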

Dependencies

- Requires an ElevenLabs API key, configured as a credential in n8n, to authenticate requests.
- The node communicates with the ElevenLabs API endpoint at https://api.elevenlabs.io/v1.
- The selected voice and model IDs must be valid and available via the ElevenLabs service.
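To make the dependencies concrete, the sketch below builds the URL and headers for a speech-to-speech request without sending it. Only the base URL comes from this page. The `/speech-to-speech/{voice_id}` path, the `xi-api-key` header, and the `output_format` query parameter are assumptions drawn from the public ElevenLabs API and should be verified against its reference; `build_speech_to_speech_request` is a hypothetical name, not a function of the node.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote, urlencode

BASE_URL = "https://api.elevenlabs.io/v1"  # endpoint named in the Dependencies section


def build_speech_to_speech_request(
    api_key: str,
    voice_id: str,
    output_format: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: return (url, headers) for a speech-to-speech call."""
    if not api_key:
        raise ValueError("an ElevenLabs API key is required")
    # Assumed path layout; the voice ID is escaped so it is safe inside a URL.
    url = f"{BASE_URL}/speech-to-speech/{quote(voice_id, safe='')}"
    if output_format is not None:
        # Assumed query parameter name for selecting the audio format.
        url += "?" + urlencode({"output_format": output_format})
    # Assumed authentication header name.
    headers = {"xi-api-key": api_key}
    return url, headers


url, headers = build_speech_to_speech_request("my-key", "voice123", "mp3_44100_128")
```

Keeping request construction separate from sending makes it easy to inspect what would be transmitted when diagnosing authentication or ID problems.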

Troubleshooting

- Unsupported language code: If a language code is provided with a model other than Turbo v2.5, the API returns an error. Solution: Use language codes only with the Turbo v2.5 model.
- Invalid voice or model ID: Selecting or specifying an invalid voice or model ID causes the request to fail. Solution: Use the provided load options to select valid voices and models, or verify the IDs.
- API authentication errors: A missing or incorrect API key results in authentication failures. Solution: Ensure the API key credential is correctly configured.
- Audio format issues: Specifying an unsupported output format, or one that does not match how the audio is used downstream, might cause problems. Solution: Use one of the supported formats listed under Output Format.
- Latency vs. quality tradeoff: Aggressive streaming latency optimization may degrade audio quality. Solution: Lower the "Streaming Latency" optimization level if quality suffers.
- Text normalization conflicts: Setting text normalization to "On" with the Turbo v2.5 model is not allowed and causes errors. Solution: Set text normalization to "Off" or "Auto".
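The two model-specific conflicts described above (language codes outside Turbo v2.5, and text normalization set to On with Turbo v2.5) can be caught before the request is made. This is a minimal sketch under stated assumptions: the model ID string `eleven_turbo_v2_5` and the lowercase option values `"auto"`, `"on"`, and `"off"` are assumed rather than confirmed by this page, and `check_model_options` is a hypothetical helper.

```python
import re
from typing import List, Optional

TURBO_V2_5 = "eleven_turbo_v2_5"  # assumed ID for the "Turbo v2.5" model


def check_model_options(
    model_id: str,
    language_code: Optional[str] = None,
    text_normalization: str = "auto",
) -> List[str]:
    """Hypothetical helper: return a list of conflicts between model and options."""
    problems = []
    is_turbo = model_id == TURBO_V2_5
    if language_code is not None:
        # ISO 639-1 codes are two letters, e.g. "en", "de", "fr".
        if not re.fullmatch(r"[a-z]{2}", language_code):
            problems.append(f"{language_code!r} is not a two-letter ISO 639-1 code")
        if not is_turbo:
            problems.append("language codes are only supported by Turbo v2.5")
    if is_turbo and text_normalization == "on":
        problems.append("text normalization cannot be 'on' for Turbo v2.5")
    return problems


conflicts = check_model_options("eleven_multilingual_v2", language_code="de")
```

In this example `conflicts` contains a single message, because a language code was supplied for a model other than Turbo v2.5.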
