Overview
This node converts input text into speech audio using the PiAPI Text-to-Speech (TTS) service. It supports voice cloning by allowing users to provide a reference audio sample, either via URL or binary data, which helps generate speech in a similar voice style. Optionally, users can include the text corresponding to the reference audio to improve voice cloning quality.
Common scenarios for this node include:
- Generating personalized voice messages or announcements.
- Creating audio content from text for accessibility or multimedia projects.
- Voice cloning applications where a specific voice style is desired based on a reference audio sample.
Practical example:
You have a marketing script and want to create an audio advertisement in a particular voice. You provide the script as text and a short reference audio clip of the target voice. The node generates speech audio that mimics the reference voice speaking your script.
Properties
| Name | Meaning |
|---|---|
| Text | The text string to be converted into speech audio. |
| Reference Audio Input Method | Method to provide the reference audio for voice cloning: either a URL or binary data from previous nodes. |
| Reference Audio Binary Property | (If Binary Data selected) Name of the binary property containing the reference audio data. |
| Reference Audio URL | (If URL selected) URL pointing to the reference audio file used for voice cloning. |
| Include Reference Text | Whether to include the text corresponding to the reference audio to improve voice cloning quality. |
| Reference Text | (If Include Reference Text enabled) Text corresponding to the reference audio sample. |
| Wait For Completion | Whether the node should wait until the speech generation task completes before continuing workflow execution. |
| Max Retries | (If waiting for completion) Maximum number of retries to check the status of the speech generation task. |
| Retry Interval | (If waiting for completion) Time interval in milliseconds between each retry when checking task status. |
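
Several of the properties above only apply when another option is selected. As a rough illustration, that dependency can be modelled with a discriminated union and a small validator. This is a sketch only: the type names, field names, and `validateParams` are hypothetical and are not the node's internal identifiers.

```typescript
// Hypothetical parameter model; names are illustrative, not taken from the node's source.
type ReferenceAudio =
  | { method: "url"; url: string }                // "Reference Audio URL"
  | { method: "binary"; binaryProperty: string }; // "Reference Audio Binary Property"

interface TtsParams {
  text: string;
  referenceAudio: ReferenceAudio;
  includeReferenceText: boolean;
  referenceText?: string;   // only used when includeReferenceText is true
  waitForCompletion: boolean;
  maxRetries?: number;      // only used when waitForCompletion is true
  retryIntervalMs?: number; // milliseconds; only used when waitForCompletion is true
}

const isPositiveInt = (n: number | undefined): boolean =>
  typeof n === "number" && n >= 1 && Math.floor(n) === n;

// Returns a list of human-readable problems; an empty list means the parameters look usable.
function validateParams(p: TtsParams): string[] {
  const problems: string[] = [];
  if (p.text.trim() === "") problems.push("Text must not be empty.");

  const ref = p.referenceAudio;
  if (ref.method === "url") {
    // Assumption: only http(s) URLs are accepted.
    if (!/^https?:\/\/\S+$/i.test(ref.url)) {
      problems.push("Reference Audio URL must be an http(s) URL.");
    }
  } else if (ref.binaryProperty.trim() === "") {
    problems.push("Reference Audio Binary Property must be set.");
  }

  if (p.includeReferenceText && (p.referenceText ?? "").trim() === "") {
    problems.push("Reference Text is required when Include Reference Text is enabled.");
  }
  if (p.waitForCompletion) {
    if (!isPositiveInt(p.maxRetries)) problems.push("Max Retries must be a positive integer.");
    if (!isPositiveInt(p.retryIntervalMs)) problems.push("Retry Interval must be a positive integer.");
  }
  return problems;
}
```

Modelling the reference audio as a union keeps the URL and binary property mutually exclusive, which mirrors how the two fields are shown conditionally in the node's UI.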
Output
The node outputs JSON data containing the status and details of the speech generation task. The key fields include:
- `task_id`: Identifier of the speech generation task.
- `status`: Current status of the task (e.g., "pending", "completed").
- Additional task-related information returned by the API.
If "Wait For Completion" is enabled, the output reflects the final status after the node has polled the task until it completes or the retry limit is reached.
The node does not directly output the generated audio binary data; instead, it provides task metadata. Further steps may be required to retrieve or download the generated audio.
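
The "Wait For Completion" behaviour described above amounts to a bounded polling loop. The following is a minimal sketch of that idea, not the node's actual helper: `pollTask`, `fetchStatus`, and the `"failed"` terminal status are assumptions made for illustration (the source only mentions "pending" and "completed"). The status lookup is injected as a function so the loop stays independent of any HTTP client.

```typescript
// Minimal task shape based on the key output fields; other fields are omitted.
type TaskStatus = { task_id: string; status: string };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Checks the status once, then re-checks up to `maxRetries` times,
// waiting `retryIntervalMs` milliseconds before each re-check.
// Returns the last status seen, whether or not it is terminal.
async function pollTask(
  fetchStatus: () => Promise<TaskStatus>,
  maxRetries: number,
  retryIntervalMs: number,
  terminal: string[] = ["completed", "failed"], // "failed" is an assumed status name
): Promise<TaskStatus> {
  let last = await fetchStatus();
  for (
    let attempt = 0;
    attempt < maxRetries && terminal.indexOf(last.status) === -1;
    attempt++
  ) {
    await sleep(retryIntervalMs);
    last = await fetchStatus();
  }
  return last;
}
```

In this sketch the worst-case wait is roughly `maxRetries * retryIntervalMs` milliseconds plus request time, which is a useful rule of thumb when choosing values for Max Retries and Retry Interval.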
Dependencies
- Requires an active connection to the PiAPI TTS service via an API key credential configured in n8n.
- Uses the PiAPI REST API endpoint `/api/v1/task` to submit text-to-speech tasks.
- Relies on helper functions for making authenticated API requests and optionally polling task status until completion.
Troubleshooting
- Error: The provided binary data is not an audio file
  Occurs if the binary input specified as reference audio is not recognized as an audio MIME type. Ensure the binary data contains valid audio content.
- Failed to get a valid task ID from the API
  Indicates the API response did not return a task identifier. Check API credentials, request payload correctness, and network connectivity.
- Timeout or incomplete task status
  When "Wait For Completion" is enabled, the node polls the task status up to the maximum number of retries. If the task does not complete in time, consider increasing Max Retries or Retry Interval.
- Invalid or missing reference audio URL
  If using the URL input method, ensure the URL is accessible and points to a valid audio file.
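
The first two errors listed in this section correspond to simple checks that can be reproduced when debugging a workflow. The sketch below shows one plausible form of each. `isAudioMimeType` and `extractTaskId` are hypothetical names, and looking for `task_id` either at the top level or under a `data` wrapper is an assumption about the response shape, not something the source confirms.

```typescript
// True when the MIME type reported for the binary item is an audio type (e.g. "audio/mpeg").
function isAudioMimeType(mimeType: string | undefined): boolean {
  return typeof mimeType === "string" && mimeType.toLowerCase().indexOf("audio/") === 0;
}

// Pulls a task ID out of an API response, or throws the error named in this section.
// Assumption: the ID is either at the top level or nested under `data`.
function extractTaskId(response: unknown): string {
  const body = response as { task_id?: unknown; data?: { task_id?: unknown } } | null;
  const candidate = body?.task_id ?? body?.data?.task_id;
  if (typeof candidate !== "string" || candidate === "") {
    throw new Error("Failed to get a valid task ID from the API");
  }
  return candidate;
}
```

If the MIME check fails for a file that is known to be audio, the upstream node may have stored it with a generic type such as `application/octet-stream`, so inspecting the binary item's metadata is a reasonable first step.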
Links and References
- PiAPI Text-to-Speech Documentation (example placeholder link)
- n8n Documentation - Creating Custom Nodes
- Handling Binary Data in n8n