Overview
This node converts speech audio data into text using a speech-to-text (STT) service. It accepts audio input encoded in Base64 format and sends it to an external API that performs the transcription. The node is useful for automating transcription tasks, such as converting voice notes, recorded calls, or any audio content into searchable and editable text.
Practical examples include:
- Transcribing customer support calls for analysis.
- Converting podcast audio into text for publishing transcripts.
- Automating note-taking from voice memos.
Properties
| Name | Meaning |
|---|---|
| Audio (Base64) | Base64 encoded audio data to be transcribed by the speech-to-text service |
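As a minimal sketch of preparing a value for the Audio (Base64) property, the Python snippet below encodes raw audio bytes with the standard library. The function name `encode_audio_to_base64` is illustrative and not part of the node.

```python
import base64


def encode_audio_to_base64(audio_bytes: bytes) -> str:
    """Return the Base64 text representation of raw audio bytes."""
    # b64encode returns bytes; decode to str so the value can sit in a JSON field.
    return base64.b64encode(audio_bytes).decode("ascii")
```

For a file on disk, the bytes could be read first, for example with `pathlib.Path(...).read_bytes()`, and the returned string passed to the node as the property value.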
Output
The node outputs one JSON item per input audio item. Each output item carries the full response body returned by the external STT API under the json field.
If an error occurs during the request, the output will contain an error field with the error message.
The node does not output binary data.
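The output shape described above can be handled downstream roughly as follows. This is a sketch under assumptions: it treats each item as a dictionary with a json key and assumes that a failed item carries its message under an error key inside that object. The function name `split_results` is hypothetical, and the structure of a successful response depends on your provider.

```python
from typing import Any


def split_results(items: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[str]]:
    """Separate successful transcription payloads from error messages."""
    succeeded: list[dict[str, Any]] = []
    failed: list[str] = []
    for item in items:
        # ASSUMPTION: each item exposes its payload under the "json" key.
        payload = item.get("json", {})
        if "error" in payload:
            # ASSUMPTION: a failed item stores its message under "error".
            failed.append(str(payload["error"]))
        else:
            succeeded.append(payload)
    return succeeded, failed
```

Keeping failures in a separate list makes it easy to log or retry them without discarding the transcriptions that did succeed.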
Dependencies
- Requires an API token credential for authenticating requests to the external speech-to-text service.
- Requires the API domain URL configured in the credentials to send requests to the correct endpoint.
- The node makes HTTP POST requests to the /speech/stt endpoint of the configured API domain.
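To illustrate how these pieces fit together, the sketch below builds (but does not send) a POST request to the /speech/stt endpoint. Only the endpoint path and the HTTP method come from this document. The bearer-token header, the JSON body with an `audio` field, and the function name `build_stt_request` are assumptions made for illustration, so check your provider's API documentation for the real request format.

```python
import json
import urllib.request


def build_stt_request(api_domain: str, api_token: str, audio_base64: str) -> urllib.request.Request:
    """Build (without sending) a POST request for the /speech/stt endpoint."""
    # Strip a trailing slash from the domain to avoid "//" in the final URL.
    url = api_domain.rstrip("/") + "/speech/stt"
    # ASSUMPTION: the body is JSON with an "audio" field; the real field name may differ.
    body = json.dumps({"audio": audio_base64}).encode("utf-8")
    headers = {
        # ASSUMPTION: bearer-token authentication; the service may use another scheme.
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Building the request separately from sending it makes the URL and headers easy to inspect when debugging credential or domain problems.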
Troubleshooting
Common issues:
- Invalid or missing API token will cause authentication failures.
- Incorrect or unreachable API domain URL will cause network errors.
- Malformed or empty Base64 audio input may lead to API errors or empty transcriptions.
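Malformed or empty input can be caught before it reaches the node. The following sketch checks only that a string is non-empty, strictly valid Base64. It does not verify that the decoded bytes are playable audio, and the function name `is_valid_base64_audio` is hypothetical.

```python
import base64
import binascii


def is_valid_base64_audio(value: str) -> bool:
    """Return True if value is a non-empty, strictly valid Base64 string."""
    candidate = value.strip()
    if not candidate:
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

Note that a value carrying a data URI prefix such as `data:audio/wav;base64,` fails this check, because the prefix contains characters outside the Base64 alphabet.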
Error messages:
- "Error stt request: ..." indicates a failure during the API call. Check the API token validity, network connectivity, and correctness of the audio data.
- If the node is set to continue on fail, errors are returned in the output JSON under the error key instead of stopping execution.
Links and References
- Refer to your speech-to-text service provider’s API documentation for details on accepted audio formats, limits, and response structure.
- n8n documentation on creating custom nodes for further customization guidance.