Overview
This node converts input text into speech audio using the Deepgram Speak API. It supports multiple voice models and customizable audio output options, making it useful for generating speech for applications like voice assistants, automated announcements, or accessibility tools. Users provide text and select a voice model, then receive synthesized audio as binary data.
Use Case Examples
- Convert customer support responses into spoken audio for phone systems.
- Generate audio narration for e-learning content using different voice models.
- Create personalized voice messages for marketing campaigns.
Properties
| Name | Meaning |
|---|---|
| Text to Speak | The text string to be converted into speech, limited to 2000 characters. |
| Voice Model | The voice model used for speech synthesis. Available models differ in gender and accent. |
| Audio Options | Settings to configure the output audio format including encoding, container, sample rate, and bit rate. |
| Output Binary Property | The name of the binary property where the generated audio data will be stored. |
| Output Filename | Optional filename for the output audio file; if empty, a default name is generated. |
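
As a rough illustration of how these properties could map onto an HTTP request, the sketch below builds (but does not send) a request to the Speak API. It is not the node's actual implementation. The endpoint URL, the `Token` authorization scheme, the query parameter names (`model`, `encoding`, `container`, `sample_rate`), and the example model name are assumptions based on Deepgram's public API and should be checked against the official documentation. `build_speak_request` is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the Deepgram Speak API; verify against the official docs.
DEEPGRAM_SPEAK_URL = "https://api.deepgram.com/v1/speak"


def build_speak_request(text, model, api_key, audio_options=None):
    """Build (but do not send) a POST request for text-to-speech.

    Hypothetical helper: the voice model and audio options travel as
    query parameters, and the text travels in a JSON body.
    """
    params = {"model": model}
    # Skip unset options so the API can fall back to its own defaults.
    for key, value in (audio_options or {}).items():
        if value not in (None, ""):
            params[key] = str(value)
    url = DEEPGRAM_SPEAK_URL + "?" + urllib.parse.urlencode(params)
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {
        "Authorization": "Token " + api_key,  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_speak_request(
    "Hello from the workflow.",
    model="aura-asteria-en",  # example model name, assumed
    api_key="YOUR_API_KEY",
    audio_options={"encoding": "linear16", "container": "wav", "sample_rate": 24000},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the response body would then be the audio bytes that the node stores under the Output Binary Property.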
Output
JSON
- json - The original input JSON data passed through the node.
- binary - Binary data containing the generated audio file.
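
The shape of one output item can be pictured roughly as follows. This is a sketch under assumptions, not the node's code: the inner field names (`data`, `mimeType`, `fileName`), the base64 encoding of the audio, the default property name `data`, and the fallback filename `speech.mp3` are all illustrative. `build_output_item` is a hypothetical helper.

```python
import base64


def build_output_item(input_json, audio_bytes, binary_property="data",
                      file_name="", mime_type="audio/mpeg"):
    """Combine the passed-through JSON with the generated audio.

    Hypothetical helper: field names and the default filename are
    assumptions chosen for illustration.
    """
    # Fall back to a default name when Output Filename is left empty.
    name = file_name or "speech.mp3"
    return {
        "json": input_json,  # original input data, unchanged
        "binary": {
            binary_property: {
                "data": base64.b64encode(audio_bytes).decode("ascii"),
                "mimeType": mime_type,
                "fileName": name,
            }
        },
    }


item = build_output_item({"ticketId": 42}, b"\x00\x01\x02", binary_property="audio")
print(sorted(item["binary"]["audio"].keys()))
```

A downstream node would read the audio from the property named in Output Binary Property (`audio` in this example).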
Dependencies
- Deepgram API key credential for authentication
Troubleshooting
- Ensure the input text is not empty and does not exceed 2000 characters to avoid errors.
- Verify that the Deepgram API key credential is correctly configured and has necessary permissions.
- If the node returns a non-audio response, check the API usage limits and error messages from Deepgram.
- Make sure the audio options are valid and supported by the Deepgram API to prevent format errors.
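
The first and third checks above can be run before and after the API call. The following is a minimal sketch, assuming the 2000-character limit is counted in characters and that a successful response carries an `audio/...` content type. `validate_text` and `is_audio_response` are hypothetical helpers, not part of the node.

```python
MAX_TEXT_LENGTH = 2000  # limit stated for the Text to Speak property


def validate_text(text):
    """Return the text unchanged, or raise ValueError if it cannot be sent."""
    if not text or not text.strip():
        raise ValueError("Text to Speak must not be empty.")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(
            f"Text to Speak has {len(text)} characters; "
            f"the limit is {MAX_TEXT_LENGTH}."
        )
    return text


def is_audio_response(content_type):
    """Guess from the Content-Type header whether the body is audio.

    Assumption: error responses arrive with a non-audio type such as
    application/json.
    """
    return (content_type or "").lower().startswith("audio/")


print(validate_text("Hello"))
print(is_audio_response("audio/mpeg"), is_audio_response("application/json"))
```

If `is_audio_response` returns `False`, the response body most likely contains an error message from Deepgram rather than audio and should be inspected before it is written to a file.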
Links
- Deepgram Text-to-Speech API Documentation - Official documentation for Deepgram's Text-to-Speech API used by this node.