Coze

Interact with Coze AI platform

Actions (12)

Audio Actions
- Text to Speech
- Audio Transcription
Chat Actions
File Actions
- Upload
- Retrieve
Workflow Actions
- Run
Workspace Actions
- List
- List Bots

Overview

This node integrates with the Coze AI platform to convert text into speech audio files. It is designed for scenarios where users want to generate spoken audio from textual content, such as creating voiceovers, accessibility features, or automated announcements. For example, you can input a product description and receive an MP3 audio file of that description read aloud in a chosen voice.

Properties

Name	Meaning
Authentication	Method to authenticate with the Coze API: either using a Service Token or OAuth2 authentication.
Input Text	The UTF-8 encoded text (up to 1024 bytes) that will be synthesized into speech.
Voice Name or ID	The voice to use for speech synthesis. Choose from a list of available voices or specify an ID via expression.
Response Format	The audio encoding format for the output file. Options: MP3, WAV, PCM, Opus (OGG container).
Speed	The speed multiplier for the speech, ranging from 0.2 (slow) to 3 (fast).
Sample Rate	The sample rate of the generated audio in Hz. Options include 8000, 16000, 22050, 24000, 32000, 44100, 48000.
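
The limits in the table above can be checked before the node runs, for example in a preceding Code node or an external script. The following is a minimal sketch, not the node's own validation logic: the function name `validate_tts_params` and the format identifiers (`"mp3"`, `"wav"`, `"pcm"`, `"ogg_opus"`) are assumptions made for illustration, and the identifiers the node actually uses may differ.

```python
# Hypothetical pre-flight check for the Text to Speech parameters.
# The limits come from the Properties table; the format identifiers
# below are assumed names, not confirmed values from the node.
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100, 48000}
MAX_INPUT_BYTES = 1024
MIN_SPEED, MAX_SPEED = 0.2, 3.0


def validate_tts_params(text, response_format, speed, sample_rate):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []

    # The limit is in bytes, not characters, so measure the UTF-8 encoding.
    size = len(text.encode("utf-8"))
    if size > MAX_INPUT_BYTES:
        problems.append(f"input text is {size} bytes; limit is {MAX_INPUT_BYTES}")

    if response_format not in ALLOWED_FORMATS:
        problems.append(f"unsupported response format: {response_format!r}")

    if not (MIN_SPEED <= speed <= MAX_SPEED):
        problems.append(f"speed {speed} is outside {MIN_SPEED} to {MAX_SPEED}")

    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")

    return problems
```

Because the text limit is measured in bytes, non-ASCII text reaches it sooner than its character count suggests: a string of 513 two-byte characters is already over the limit.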

Output

The node's output contains the synthesized audio in the selected encoding format (e.g., MP3, WAV). The audio data is typically provided as binary content, so downstream nodes can upload it, play it, or save it as a file.
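
If the audio bytes are handled outside n8n, saving them with an extension that matches the chosen format might look like the sketch below. The helper `save_audio`, the default file stem, and the format identifiers are all hypothetical. The only detail taken from this page is that Opus output uses an OGG container, which is why it maps to `.ogg` here.

```python
from pathlib import Path

# Assumed format identifiers mapped to conventional file extensions.
# PCM is raw sample data with no header, so ".pcm" is only a label.
EXTENSIONS = {
    "mp3": ".mp3",
    "wav": ".wav",
    "pcm": ".pcm",
    "ogg_opus": ".ogg",
}


def save_audio(audio_bytes, response_format, directory, stem="speech"):
    """Write audio bytes to <directory>/<stem><ext> and return the path."""
    # Raises KeyError for a format that is not in the mapping.
    suffix = EXTENSIONS[response_format]
    path = Path(directory) / f"{stem}{suffix}"
    path.write_bytes(audio_bytes)
    return path
```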

Dependencies

Requires access to the Coze AI platform API.
Needs valid authentication credentials configured in n8n, either via a service token or OAuth2.
Network connectivity to https://api.coze.cn is necessary.

Troubleshooting

Authentication errors: Ensure that the service token or OAuth2 credentials are correctly set up and have not expired.
Input text too long: The input text must be UTF-8 encoded and no longer than 1024 bytes; exceeding this limit may cause errors.
Unsupported voice ID: If the specified voice ID is invalid or unavailable, the request will fail. Select a voice from the "Voice Name or ID" dropdown, or check that the expression you use resolves to a valid voice ID.
Invalid response format or sample rate: Selecting unsupported combinations might result in errors; stick to the provided options.
Network issues: Connectivity problems to the Coze API endpoint will prevent execution.
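
For the "input text too long" case, one possible workaround is to split the text into pieces that each fit the 1024-byte limit and synthesize them one at a time. The sketch below is an assumption about how that could be done, not a feature of the node; `split_utf8` is a hypothetical helper. It splits only at character boundaries, so a multi-byte character is never cut in half.

```python
def split_utf8(text, max_bytes=1024):
    """Split text into chunks whose UTF-8 encoding is at most max_bytes."""
    if max_bytes < 4:
        # A single UTF-8 character can take up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")

    chunks = []
    current = []       # characters collected for the chunk in progress
    current_size = 0   # encoded size of that chunk in bytes

    for ch in text:
        ch_size = len(ch.encode("utf-8"))
        # Close the chunk before it would exceed the byte limit.
        if current and current_size + ch_size > max_bytes:
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(ch)
        current_size += ch_size

    if current:
        chunks.append("".join(current))
    return chunks
```

This version can break a chunk in the middle of a word or sentence, which may produce audible seams between audio segments. Splitting at sentence boundaries first and falling back to a byte-based split only for oversized sentences would likely sound more natural.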

Links and References

Coze AI Platform Documentation (for API details and voice options)
n8n Expressions Documentation (for dynamic parameter values)
