Zalo User Interact

Send messages and interact with Zalo users.

Overview

This node provides a "Create TTS" (Text-to-Speech) operation under the "Tool" resource. It converts input text into speech audio using selectable voice parameters such as voice type, speaking rate, volume, and pitch. This is useful for automating audio content generation from text, enabling applications like voice notifications, accessibility features, or multimedia content creation.

Common scenarios:

Generating spoken versions of articles or messages.
Creating voice prompts for IVR systems or chatbots.
Producing audio files for podcasts or e-learning materials.
Enhancing accessibility by converting text to speech.

Practical example:
You provide a Vietnamese text string and select a Vietnamese male neural voice with customized speed, volume, and pitch. The node outputs an audio file representing the spoken text, which can then be used in your workflow or saved for playback.

Properties

Name	Meaning
Text	The text content to convert into speech audio.
Voice	The voice profile used for synthesis. Options are dynamically loaded via `getVoices`.
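Rate	Speaking rate adjustment as a percentage relative to normal speed. "0%" leaves the speed unchanged; positive and negative percentages are accepted.
Volume	Volume adjustment as a percentage relative to the default volume. "0%" leaves the volume unchanged; positive and negative percentages are accepted.
Pitch	Pitch adjustment in hertz relative to the default pitch. "0Hz" leaves the pitch unchanged; positive and negative values are accepted.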
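
The value formats in the table can be checked before the node runs. The sketch below is only an illustration: the patterns are inferred from the "0%" and "0Hz" examples, and the names `checkAdjustments` and `TtsAdjustments` are hypothetical, not part of the node. The underlying service may accept a wider or narrower set of values.

```typescript
// Assumed formats, inferred from the examples "0%" and "0Hz":
// an optional sign, one or more digits, then the unit.
const PERCENT_PATTERN = /^[+-]?\d+%$/;
const HERTZ_PATTERN = /^[+-]?\d+Hz$/;

// Hypothetical container for the three adjustment properties.
interface TtsAdjustments {
  rate: string;
  volume: string;
  pitch: string;
}

// Returns a list of human-readable problems.
// An empty list means all three values match the assumed formats.
function checkAdjustments(params: TtsAdjustments): string[] {
  const problems: string[] = [];
  if (!PERCENT_PATTERN.test(params.rate)) {
    problems.push(`Rate "${params.rate}" is not a percentage such as "0%"`);
  }
  if (!PERCENT_PATTERN.test(params.volume)) {
    problems.push(`Volume "${params.volume}" is not a percentage such as "0%"`);
  }
  if (!HERTZ_PATTERN.test(params.pitch)) {
    problems.push(`Pitch "${params.pitch}" is not a hertz value such as "0Hz"`);
  }
  return problems;
}
```

Returning a list rather than throwing on the first mismatch lets a workflow report every badly formatted value in one pass.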

Output

The node outputs the generated speech audio together with its metadata. Each output item has two fields:

json: Contains metadata about the generated audio.
binary: Holds the actual audio file data encoded in binary format, suitable for saving or further processing.

The audio format and encoding depend on the underlying TTS service; standard formats such as MP3 or WAV are typical.
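
Because the audio format is not fixed, a downstream step may need to work out a file name from the binary entry. The following is a minimal sketch, assuming the entry follows n8n's usual binary shape (`data`, `mimeType`, optional `fileName` and `fileExtension`). Whether this node fills in every field, and which binary property key it uses, is not documented here. The helper `audioFileName` and the MIME lookup table are hypothetical.

```typescript
// Local stand-in for the binary entry n8n attaches to an item.
// Which of the optional fields this node populates is an assumption.
interface BinaryEntry {
  data: string; // base64-encoded file content
  mimeType: string;
  fileName?: string;
  fileExtension?: string;
}

// Small lookup for the formats mentioned in the text (MP3, WAV).
const EXTENSION_BY_MIME: { [mime: string]: string | undefined } = {
  "audio/mpeg": "mp3",
  "audio/wav": "wav",
  "audio/x-wav": "wav",
};

// Picks a file name: the provided name if present, otherwise a fallback
// base name plus an extension derived from the entry or its MIME type.
function audioFileName(entry: BinaryEntry, fallbackBase: string = "speech"): string {
  if (entry.fileName) {
    return entry.fileName;
  }
  const extension = entry.fileExtension ?? EXTENSION_BY_MIME[entry.mimeType] ?? "bin";
  return `${fallbackBase}.${extension}`;
}
```

For an entry with MIME type `audio/mpeg` and no file name, the helper yields `speech.mp3`. Unknown MIME types fall back to a `.bin` extension rather than guessing.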

Dependencies

Requires access to an external Text-to-Speech API or service that supports voice selection and audio parameter adjustments.
Needs proper API authentication credentials configured in n8n (e.g., an API key or token).
Uses dynamic loading of available voices via a method named `getVoices`.
May require Node.js modules for handling buffers and file system operations internally.
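
In n8n, a dynamically loaded dropdown is typically fed by a method that returns a list of `{ name, value }` options. The sketch below shows how a `getVoices`-style method might map voice records to that shape. The actual implementation is obfuscated, so the `VoiceRecord` fields (`id`, `displayName`, `locale`) and the helper `toVoiceOptions` are hypothetical.

```typescript
// Local stand-in for the option shape used by n8n dropdowns.
interface PropertyOption {
  name: string; // label shown in the UI
  value: string; // value stored in the node parameter
}

// Hypothetical voice record; the real service's field names are unknown.
interface VoiceRecord {
  id: string;
  displayName: string;
  locale: string;
}

// Maps voice records to dropdown options, sorted by label.
function toVoiceOptions(voices: VoiceRecord[]): PropertyOption[] {
  return voices
    .map((voice) => ({
      name: `${voice.displayName} (${voice.locale})`,
      value: voice.id,
    }))
    .sort((a, b) => a.name.localeCompare(b.name));
}
```

In a real node, the voice list would come from a request to the TTS service made with the configured credentials. That request is left out here because its endpoint and response format are not known.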

Troubleshooting

Common issues:

Invalid or empty text input: The node requires non-empty text; ensure the "Text" property is provided.
Unsupported voice selection: If the selected voice is not available or incorrectly specified, the node may fail.
API authentication errors: Missing or invalid API credentials will cause authorization failures.
Parameter formatting errors: Incorrectly formatted rate, volume, or pitch values may lead to synthesis errors.
Network or service downtime: External TTS service unavailability will result in request failures.

Error messages and resolutions:

Authentication failed: Verify that the API key or token is correctly set up in n8n credentials.
Voice not found: Choose a valid voice from the dynamically loaded options.
Invalid parameter value: Check that rate, volume, and pitch follow expected formats (e.g., "0%", "0Hz").
Empty text error: Provide valid text input to synthesize.
Request timeout or network error: Ensure stable internet connection and that the TTS service endpoint is reachable.
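
Two of the issues above, empty text and an unavailable voice, can be caught before any request is sent. This is a minimal sketch of such a pre-flight check, assuming the list of available voice identifiers has already been loaded. The function name `preflight` and its return convention are hypothetical, and the messages are illustrative rather than the node's actual error strings.

```typescript
// Returns a description of the first problem found, or null if the
// input looks usable. Only checks text and voice; authentication and
// network failures can only surface when the request is made.
function preflight(text: string, voice: string, availableVoices: string[]): string | null {
  if (text.trim().length === 0) {
    return "Text is empty; provide text to synthesize";
  }
  if (availableVoices.indexOf(voice) === -1) {
    return `Voice "${voice}" is not in the loaded voice list`;
  }
  return null;
}
```

Running this in a step before the TTS node makes these two failure modes explicit in the workflow.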

Links and References

Note: The source code was heavily obfuscated, so this summary is based on static analysis of the provided properties and on typical TTS node patterns. The code was not executed, and internal credential names could not be determined.