ElevenLabs

WIP

Overview

The ElevenLabs node provides audio processing operations, including speech-related ones such as audio isolation. The Audio Isolation operation extracts vocals or speech from an input audio file, separating the voice from background sounds or music. This is useful for cleaning up podcast recordings, improving voice clarity in interviews, or removing noise from audio before transcription.

Practical examples:

  • Removing background music from a recorded interview to focus on the speaker's voice.
  • Isolating vocals from a song for remixing or karaoke purposes.
  • Enhancing speech clarity before feeding audio into a speech-to-text engine.

Properties

  • Binary Input Field: Name of the binary property containing the audio file to isolate vocals or speech from (required).
  • Additional Fields: Optional extra parameters (not used directly by Audio Isolation, but available for other operations).

Additional Fields (shared across the Speech resource, not specific to Audio Isolation)

  • Binary Name: Change output binary name (for some operations).
  • File Name: Change output file name (for some operations).
  • Streaming Latency: Optimize latency at some quality cost (for TTS and voice changer).
  • Output Format: Choose output audio format (for TTS, voice changer, sound generation).
  • Other fields: Language Code, Model ID, Stability, Similarity Boost, Style, Speaker Boost, Seed, Enable Logging, Text Normalization, Use PVC as IVC, Stitching, Previous/Next Request IDs, Remove Background Noise, Transcript Model ID, Transcript Language Code, Tag Audio Events, Number of Speakers, Timestamps Granularity, Speaker Diarization, Duration, Prompt Influence. These apply to other operations such as text-to-speech, voice changer, speech-to-text, or sound generation, not to Audio Isolation.

Output

  • The node outputs the isolated audio data as binary in the specified binary property (default "data").
  • The binary data contains the audio with vocals/speech isolated from the original input.
  • The node reads the API response as an arraybuffer, which n8n then stores as binary data.
  • No JSON output structure is explicitly defined for this operation since the main result is binary audio data.
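
To make the output shape concrete, here is a minimal sketch of how raw response bytes might be wrapped into an n8n-style item. It assumes n8n's usual in-memory layout (base64 `data` plus `mimeType` and `fileName` under `binary.<property>`). The helper `to_binary_item`, the default file name, and the default MIME type are hypothetical and chosen only for illustration.

```python
import base64


def to_binary_item(audio: bytes, binary_name: str = "data",
                   file_name: str = "isolated.mp3",
                   mime_type: str = "audio/mpeg") -> dict:
    """Wrap raw audio bytes in an n8n-style item with an empty JSON part."""
    return {
        "json": {},  # no JSON payload is defined for this operation
        "binary": {
            # Assumption: the property name defaults to "data", as stated above.
            binary_name: {
                "data": base64.b64encode(audio).decode("ascii"),
                "mimeType": mime_type,   # hypothetical default
                "fileName": file_name,   # hypothetical default
            }
        },
    }


# Round trip: the stored base64 string decodes back to the original bytes.
item = to_binary_item(b"\x00\x01fake-audio")
restored = base64.b64decode(item["binary"]["data"]["data"])
```

A downstream node would read the audio from `binary.data` (or whichever name was set through the Binary Name field) rather than from the JSON part.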

Dependencies

  • Requires an active API key credential for ElevenLabs API.
  • The node sends requests to https://api.elevenlabs.io/v1/audio-isolation.
  • The input audio must be provided as binary data in the specified binary input field.
  • No additional environment variables are required beyond the API key.
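
As a rough illustration of the request described above, the sketch below builds a multipart POST to the endpoint using only the Python standard library. The `xi-api-key` header and the `audio` form field are assumptions based on the public ElevenLabs API, not something stated on this page, and `build_isolation_request` and `isolate_audio` are hypothetical helpers rather than part of the node.

```python
import uuid
import urllib.request

API_URL = "https://api.elevenlabs.io/v1/audio-isolation"


def build_isolation_request(api_key: str, audio: bytes,
                            file_name: str = "input.mp3",
                            mime_type: str = "audio/mpeg") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST carrying the audio."""
    boundary = uuid.uuid4().hex
    # Assumption: the API expects the file in a form field named "audio".
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{file_name}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + audio + tail

    headers = {
        # Assumption: the API key is sent in the "xi-api-key" header.
        "xi-api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def isolate_audio(api_key: str, audio: bytes, timeout: float = 120.0) -> bytes:
    """Send the request and return the raw response bytes (the isolated audio)."""
    request = build_isolation_request(api_key, audio)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Building the request separately from sending it makes it possible to inspect the headers and body without a network call or a real API key.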

Troubleshooting

  • Common issues:

    • Missing or incorrect binary input field name: Ensure the binary property containing the audio file is correctly named and exists.
    • Invalid or expired API key: Verify the API key credential is valid and has access to ElevenLabs services.
    • Unsupported audio formats: Confirm the input audio format is supported by the ElevenLabs API.
    • Large audio files: These may cause timeouts or slow responses.
  • Error messages:

    • Authentication errors: Check API key validity.
    • 4xx or 5xx HTTP errors: Review request payload and network connectivity.
    • Empty or corrupted output: Verify input binary data integrity.
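
The checks above can be sketched as a small pre-flight validation plus a status-code classifier. Both functions are hypothetical helpers, the item layout follows an assumed n8n-style structure (`binary.<property>.data` holding base64), and the hint strings simply restate the advice in this section.

```python
import base64
import binascii


def get_input_audio(item: dict, field_name: str = "data") -> bytes:
    """Return the decoded audio bytes, or raise a descriptive ValueError."""
    binary = item.get("binary") or {}
    if field_name not in binary:
        available = ", ".join(sorted(binary)) or "none"
        raise ValueError(
            f"Binary property '{field_name}' not found (available: {available})"
        )
    try:
        audio = base64.b64decode(binary[field_name].get("data", ""), validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Binary property '{field_name}' is not valid base64") from exc
    if not audio:
        raise ValueError(f"Binary property '{field_name}' is empty")
    return audio


def describe_http_error(status: int) -> str:
    """Map an HTTP status code to the matching troubleshooting hint."""
    if status in (401, 403):
        return "Authentication error: check that the API key is valid."
    if 400 <= status < 500:
        return "Client error: review the request payload and audio format."
    if 500 <= status < 600:
        return "Server error: retry later and check network connectivity."
    return "Not an error status."
```

Validating the binary property before sending surfaces a misnamed input field as a clear local error instead of an opaque API failure.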

Links and References
