nvidia-nim-whisper-v2

n8n community node for NVIDIA NIM Whisper Large V3 – speech recognition and translation via Riva gRPC API

Package Information

Downloads: 7 weekly / 40 monthly
Latest Version: 0.1.3

Documentation

n8n-nodes-nvidia-nim-whisper

n8n community node for NVIDIA NIM Whisper Large V3 – speech-to-text transcription and translation powered by the NVIDIA Riva gRPC API.

Features

| Operation | Description |
|---|---|
| Transcribe | Convert audio to text in the source language |
| Translate | Translate audio to English text |

  • Supports WAV (Mono 16-bit), OPUS and FLAC audio formats
  • Auto language detection (multi) or 30+ explicit language codes
  • Configurable sample rate, punctuation, word-time offsets, alternatives
  • Works with NVIDIA cloud API and self-hosted NIM instances
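
Since WAV input is expected to be mono and 16-bit, it can help to inspect a file before sending it. The sketch below is not part of this package: `readWavInfo` and `isMono16Bit` are hypothetical helpers. It also assumes the simplest WAV layout, a canonical 44-byte header with the `fmt ` chunk directly after the RIFF header, which does not hold for every WAV file.

```typescript
// Hypothetical pre-flight check, not part of the node's API.
interface WavInfo {
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

function readWavInfo(bytes: Uint8Array): WavInfo | null {
  if (bytes.length < 44) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Read a four-character chunk tag at the given byte offset.
  const tag = (o: number): string =>
    String.fromCharCode(bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]);
  // Assumption: "fmt " immediately follows the 12-byte RIFF/WAVE header.
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE" || tag(12) !== "fmt ") return null;
  return {
    channels: view.getUint16(22, true),      // little-endian channel count
    sampleRate: view.getUint32(24, true),    // samples per second
    bitsPerSample: view.getUint16(34, true), // bit depth
  };
}

function isMono16Bit(info: WavInfo): boolean {
  return info.channels === 1 && info.bitsPerSample === 16;
}
```

The `sampleRate` value read this way is the number to enter in the Sample Rate (Hz) option when the file is not 16000 Hz.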

Prerequisites

| Requirement | Details |
|---|---|
| n8n | >= 1.0 |
| Node.js | >= 18 |
| NVIDIA API key | Get one here |

Installation

Community node (recommended)

  1. Open Settings → Community Nodes in your n8n instance
  2. Search for n8n-nodes-nvidia-nim-whisper
  3. Install

Manual

cd ~/.n8n/nodes
npm install n8n-nodes-nvidia-nim-whisper

Then restart n8n.

Credentials

Create an NVIDIA NIM API credential with:

| Field | Default | Description |
|---|---|---|
| API Key | | Your NVIDIA API key (Bearer token) |
| Server Address | grpc.nvcf.nvidia.com:443 | gRPC endpoint. Change for self-hosted NIM. |
| Use SSL | true | Disable for local instances without TLS |
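
To illustrate how these fields might translate into a gRPC call, here is a minimal sketch that builds the per-call metadata. The `NimCredentials` interface and the `buildCallMetadata` function are hypothetical and not taken from this package. The header names `authorization` and `function-id` follow the convention used by NVIDIA's hosted gRPC endpoints, but whether this node sets them in exactly this way is an assumption.

```typescript
// Hypothetical shape mirroring the credential fields above.
interface NimCredentials {
  apiKey: string;
  serverAddress: string;
  useSsl: boolean;
}

// Builds gRPC metadata as plain key/value pairs.
// functionId corresponds to the node's "Function ID" option (cloud API only).
function buildCallMetadata(
  creds: NimCredentials,
  functionId?: string,
): Record<string, string> {
  const metadata: Record<string, string> = {};
  if (creds.apiKey) {
    // The API key is sent as a Bearer token.
    metadata["authorization"] = `Bearer ${creds.apiKey}`;
  }
  if (functionId) {
    // Selects the hosted function; a self-hosted NIM would typically omit this.
    metadata["function-id"] = functionId;
  }
  return metadata;
}
```

With a self-hosted instance and no API key, the function returns an empty object, which matches the idea that a local deployment without TLS may need no authentication headers at all.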

Usage

  1. Add a node that produces binary audio data (e.g. Read Binary File or HTTP Request)
  2. Connect it to the NVIDIA NIM Whisper node
  3. Configure:
    • Operation – Transcribe or Translate
    • Input Data Field Name – binary property name (default: data)
    • Language – source language or Auto Detect
  4. Execute the workflow – the node returns the transcribed text in the text field
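
Downstream nodes can then pick up the `text` field from each item. As a rough sketch, assuming the usual n8n item shape where data lives under `json`, the hypothetical helper below (`joinTranscripts` is not part of this package) merges the transcripts of several audio files into one string:

```typescript
// Minimal item shape: only the `text` field documented above is assumed.
interface WhisperItem {
  json: { text?: string };
}

// Joins non-empty transcripts, one per line by default.
function joinTranscripts(items: WhisperItem[], separator = "\n"): string {
  return items
    .map((item) => (item.json.text ?? "").trim())
    .filter((text) => text.length > 0)
    .join(separator);
}
```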

Translate example

To translate French audio to English:

  1. Set Operation to Translate
  2. Set Language to French
  3. The output text field will contain the English translation

Options

| Option | Default | Description |
|---|---|---|
| Audio Encoding | Auto Detect | Override auto-detection (LINEAR_PCM, FLAC, OGGOPUS, …) |
| Enable Punctuation | true | Automatic punctuation |
| Enable Word Time Offsets | false | Include word-level timestamps |
| Function ID | b702f636-… | NVIDIA NIM function ID (cloud API) |
| Max Alternatives | 1 | Number of alternative transcriptions |
| Sample Rate (Hz) | 16000 | Input audio sample rate |
| Timeout (Seconds) | 60 | gRPC deadline |
| Verbatim Transcripts | false | Raw transcription without normalization |
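
The sketch below shows one way these options could map onto a Riva `RecognitionConfig` message, including a simple form of encoding auto-detection based on file magic bytes. The output field names follow the Riva ASR proto. The `NodeOptions` interface, both function names and the detection logic are assumptions for illustration, not the node's actual implementation. Function ID and Timeout are left out because they apply to the gRPC call (metadata and deadline), not to the recognition config.

```typescript
type AudioEncoding = "LINEAR_PCM" | "FLAC" | "OGGOPUS";

// Hypothetical option bag mirroring the table above.
interface NodeOptions {
  audioEncoding?: AudioEncoding;
  enablePunctuation?: boolean;
  enableWordTimeOffsets?: boolean;
  maxAlternatives?: number;
  sampleRateHz?: number;
  verbatimTranscripts?: boolean;
}

// Guess the container from its first four bytes.
// Caveat: a RIFF file may hold non-PCM audio and an Ogg file may hold
// a codec other than Opus; this check looks at the container only.
function detectEncoding(bytes: Uint8Array): AudioEncoding | null {
  if (bytes.length < 4) return null;
  const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  if (magic === "RIFF") return "LINEAR_PCM";
  if (magic === "fLaC") return "FLAC";
  if (magic === "OggS") return "OGGOPUS";
  return null;
}

function buildRecognitionConfig(
  audio: Uint8Array,
  languageCode: string,
  opts: NodeOptions = {},
) {
  // An explicit option wins over auto-detection.
  const encoding = opts.audioEncoding ?? detectEncoding(audio);
  if (encoding === null) {
    throw new Error("Could not detect audio encoding; set Audio Encoding explicitly");
  }
  return {
    encoding,
    sample_rate_hertz: opts.sampleRateHz ?? 16000,
    language_code: languageCode,
    max_alternatives: opts.maxAlternatives ?? 1,
    enable_automatic_punctuation: opts.enablePunctuation ?? true,
    enable_word_time_offsets: opts.enableWordTimeOffsets ?? false,
    verbatim_transcripts: opts.verbatimTranscripts ?? false,
  };
}
```

Calling `buildRecognitionConfig(audioBytes, "multi")` with no options yields the defaults from the table, with the encoding inferred from the file header.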

Development

npm install
npm run build

License

MIT
