n8n-nodes-transcribe-audio
This is an n8n community node. It lets you perform speech-to-text on audio files within your n8n workflows. This node provides local audio transcription; no internet or third-party APIs are required for processing.
It utilizes Hugging Face Transformers.js and Whisper models to transcribe audio.
n8n is a fair-code licensed workflow automation platform.
Installation
Operations
Models
Credentials
Compatibility
Usage
Resources
Installation
Follow the installation guide in the n8n community nodes documentation.
Operations
- Transcribe: Takes an audio file (binary input) and returns the transcribed text. It can handle various audio formats (e.g., MP3, WAV) by converting them to the required WAV format (16kHz, 16-bit PCM, mono) using FFmpeg.
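The conversion step described above can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the helper names buildFfmpegArgs and convertToWav are hypothetical, and the sketch assumes an ffmpeg binary is on the PATH. The flags themselves are standard FFmpeg options for resampling to 16 kHz, downmixing to mono, and encoding 16-bit PCM.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: builds the FFmpeg arguments for the target format
// named in this README (16 kHz, 16-bit PCM, mono WAV).
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",               // overwrite the output file if it already exists
    "-i", inputPath,    // source audio (MP3, WAV, ...)
    "-ar", "16000",     // resample to 16 kHz
    "-ac", "1",         // downmix to a single (mono) channel
    "-c:a", "pcm_s16le", // encode as signed 16-bit little-endian PCM
    outputPath,
  ];
}

// Hypothetical wrapper: runs FFmpeg and rejects if the process fails.
// Assumes `ffmpeg` is installed and reachable on the PATH.
async function convertToWav(inputPath: string, outputPath: string): Promise<void> {
  await execFileAsync("ffmpeg", buildFfmpegArgs(inputPath, outputPath));
}
```

Passing the arguments as an array to execFile, rather than building a shell string, avoids quoting problems with file names that contain spaces.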
Models
The node allows you to select from a list of pre-configured Xenova Whisper models:
- Xenova/whisper-tiny.en
- Xenova/whisper-base.en
- Xenova/whisper-small.en
- Xenova/whisper-medium.en
Larger models generally provide better accuracy but require more processing power and time.
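For orientation, here is a rough sketch of how one of these models might be loaded with Transformers.js. It assumes the @xenova/transformers package and its pipeline function for the automatic-speech-recognition task. The names WHISPER_MODELS, isWhisperModel, and loadRecognizer are hypothetical, and the node's own code may be organized differently.

```typescript
// Model IDs listed in this README.
const WHISPER_MODELS = [
  "Xenova/whisper-tiny.en",
  "Xenova/whisper-base.en",
  "Xenova/whisper-small.en",
  "Xenova/whisper-medium.en",
] as const;

type WhisperModel = (typeof WHISPER_MODELS)[number];

// Hypothetical guard: checks a user-supplied string against the list above.
function isWhisperModel(value: string): value is WhisperModel {
  return WHISPER_MODELS.some((model) => model === value);
}

// Takes mono audio samples at 16 kHz and resolves to the recognized text.
type SpeechRecognizer = (audio: Float32Array) => Promise<{ text: string }>;

// Hypothetical loader. The package is imported lazily, so it is only
// required once a model is actually loaded (assumption: the
// @xenova/transformers package is installed).
async function loadRecognizer(model: WhisperModel): Promise<SpeechRecognizer> {
  const moduleName = "@xenova/transformers";
  const { pipeline } = await import(moduleName);
  const asr = await pipeline("automatic-speech-recognition", model);
  return async (audio) => {
    const result = await asr(audio);
    return { text: result.text };
  };
}
```

A caller would validate the selected model with isWhisperModel, then call loadRecognizer once and reuse the returned function across items, since loading a model is the expensive part.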
Credentials
This node does not require any credentials.
Compatibility
- n8n Version: Tested with n8n versions 1.0.0 and above.
- Node.js Version: Requires Node.js version >=20.15, as specified in the package.json.
Usage
- Input: Provide an audio file via a binary property (default: data).
- Binary Property Name: Specify the name of the binary property containing the audio data if it's not data.
- Model Selection: Choose the desired Whisper model for transcription.
- Output: The node will output the transcribed text in json.transcription and potentially other related information.
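As an illustration of consuming the output, the sketch below collects the transcription field from a list of items. Only the json.transcription field comes from this README. The TranscribeOutputItem interface and the collectTranscriptions helper are hypothetical, and any other fields on the item are treated as unknown.

```typescript
// Assumed shape of one output item: only `json.transcription` is documented;
// other fields may exist and are left untyped here.
interface TranscribeOutputItem {
  json: { transcription?: unknown; [key: string]: unknown };
}

// Hypothetical helper: returns the non-empty transcription strings,
// trimmed, and skips items where the field is missing or not a string.
function collectTranscriptions(items: TranscribeOutputItem[]): string[] {
  return items
    .map((item) => item.json.transcription)
    .filter((text): text is string => typeof text === "string" && text.trim().length > 0)
    .map((text) => text.trim());
}
```

Checking the field's type before use keeps a downstream step from failing on items that carry no transcription.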
Ensure FFmpeg is installed and accessible in your n8n environment if you plan to process audio formats other than WAV, as the node relies on it for audio conversion.
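One way to verify that FFmpeg is reachable from the n8n environment is a small check like the one below. It is a sketch under the assumption that the binary is called ffmpeg and is looked up on the PATH. The isFfmpegAvailable helper is hypothetical and not part of the node.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical check: runs `<command> -version` and reports whether the
// process could be started and exited successfully.
function isFfmpegAvailable(command: string = "ffmpeg"): boolean {
  const result = spawnSync(command, ["-version"], { stdio: "ignore" });
  // `error` is set when the binary cannot be spawned (for example, not found).
  return result.error === undefined && result.status === 0;
}
```

Running this in the same container or host that runs n8n matters, because FFmpeg installed on a developer machine is not visible inside a Docker-based n8n instance.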