DeepInfra

Use DeepInfra API for AI operations

Overview

This node integrates with the DeepInfra API to perform speech recognition tasks, specifically transcription and translation of audio files. It supports input audio either via a direct URL or binary data from previous nodes. The node uses OpenAI Whisper models for processing the audio.

Typical use cases include:

  • Translating spoken content in audio files into text in another language.
  • Converting audio speech into text transcripts for further analysis or storage.
  • Processing audio from URLs or directly from binary data within n8n workflows.

For example, you can supply the URL of an audio file containing speech in a foreign language and get back its English translation, or pass recorded audio as binary data from a previous node and receive a transcript.

Properties

| Name | Meaning |
| --- | --- |
| Model | The speech recognition model to use. Options: "Openai Whisper-Large-V3", "Openai Whisper-Large-V3-Turbo" |
| Input Type | Specifies whether the audio input is provided as a URL or as binary data. Options: "URL", "Binary Data" |
| Audio URL | (Required if Input Type is URL) The URL of the audio file to transcribe or translate |
| Binary Property | (Required if Input Type is Binary Data) The name of the binary property containing the audio data |
| Options | Additional optional parameters: |
| - Language | The language code of the audio (ISO-639-1 format), used to guide transcription |
| - Prompt | Optional text prompt to influence the model's style or continue a previous segment |
| - Temperature | Sampling temperature for transcription, between 0 and 1, controlling randomness |
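
The constraints in the properties above (a required field per input type, an ISO-639-1 language code, a temperature between 0 and 1) can be checked before a request is sent. The following is a minimal sketch, not the node's actual implementation: the `NodeParams` and `SpeechOptions` shapes, the `validateParams` helper, and the field names are all hypothetical and chosen only to mirror the table.

```typescript
// Hypothetical parameter shapes mirroring the properties table (illustrative names only).
type InputType = "url" | "binaryData";

interface SpeechOptions {
  language?: string;    // ISO-639-1 code, e.g. "de"
  prompt?: string;      // optional style or continuation hint
  temperature?: number; // expected to lie between 0 and 1
}

interface NodeParams {
  model: string;
  inputType: InputType;
  audioUrl?: string;
  binaryProperty?: string;
  options?: SpeechOptions;
}

// Returns a list of human-readable problems; an empty list means the parameters look usable.
function validateParams(p: NodeParams): string[] {
  const problems: string[] = [];

  if (p.inputType === "url" && !p.audioUrl) {
    problems.push("Audio URL is required when Input Type is URL");
  }
  if (p.inputType === "binaryData" && !p.binaryProperty) {
    problems.push("Binary Property is required when Input Type is Binary Data");
  }

  const options: SpeechOptions = p.options ?? {};
  // Two lowercase letters is only a shape check, not a lookup against the ISO-639-1 list.
  if (options.language !== undefined && !/^[a-z]{2}$/.test(options.language)) {
    problems.push("Language should be a two-letter ISO-639-1 code");
  }
  // The negated range test also rejects NaN.
  if (options.temperature !== undefined && !(options.temperature >= 0 && options.temperature <= 1)) {
    problems.push("Temperature must be between 0 and 1");
  }
  return problems;
}
```

Collecting every problem in a list, rather than throwing on the first one, lets a workflow author fix all misconfigured fields in a single pass.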

Output

The node outputs JSON data representing the result of the speech recognition operation:

  • For the Translate operation, the output JSON contains the translated text returned by the DeepInfra API.
  • The structure matches the response of DeepInfra's audio translation endpoint.
  • This operation produces no binary output.

Example output snippet (simplified):

{
  "text": "Translated text of the audio content"
}
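
A downstream step that consumes this output may want to confirm the shape before using it. The sketch below assumes only the simplified `text` field shown above; the `TranslationResult` type and the `extractText` helper are hypothetical names, and the real API response may carry additional fields that this check ignores.

```typescript
// Hypothetical type for the simplified output shown above.
interface TranslationResult {
  text: string;
}

// Type guard: true only when the value is an object with a string "text" field.
function isTranslationResult(value: unknown): value is TranslationResult {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { text?: unknown }).text === "string"
  );
}

// Returns the trimmed text, or throws if the payload does not have the expected shape.
function extractText(json: unknown): string {
  if (!isTranslationResult(json)) {
    throw new Error("Unexpected response shape: missing string field 'text'");
  }
  return json.text.trim();
}
```

For instance, `extractText(JSON.parse('{"text": " Hello "}'))` yields the string `Hello`, while an object without a `text` field raises an error instead of silently passing `undefined` along.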

Dependencies

  • Requires an active API key credential for the DeepInfra API.
  • Uses the DeepInfra OpenAI-compatible API endpoint at https://api.deepinfra.com/v1/openai.
  • The node relies on the following, either bundled internally or built into Node.js:
    • openai (npm package) for the API client
    • axios (npm package) for HTTP requests
    • fs, path, and os (Node.js built-in modules) for temporary file handling
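
The temporary file handling mentioned above typically follows a write, use, clean-up pattern. The following is a rough sketch of that pattern using only Node.js built-in modules; it is not taken from the node's source, and the `withTempAudioFile` helper and the `deepinfra-audio-` directory prefix are hypothetical names.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Writes audio bytes into a uniquely named directory under the OS temp dir,
// runs the callback with the file path, and always removes the directory afterwards.
async function withTempAudioFile<T>(
  data: Uint8Array,
  fileName: string,
  use: (filePath: string) => Promise<T>,
): Promise<T> {
  // mkdtemp appends random characters, so parallel executions do not collide.
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "deepinfra-audio-"));
  // basename strips any directory components from the supplied name.
  const filePath = path.join(dir, path.basename(fileName));
  try {
    await fs.promises.writeFile(filePath, data);
    return await use(filePath);
  } finally {
    // Clean up even when the callback throws; force ignores an already-missing directory.
    await fs.promises.rm(dir, { recursive: true, force: true });
  }
}
```

Putting the clean-up in `finally` keeps failed API calls from leaving audio files behind. If `mkdtemp` or `writeFile` fails here, it usually points to missing write permissions on the temp directory.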

Troubleshooting

  • Common issues:

    • An invalid or missing API key causes authentication errors.
    • An invalid or inaccessible audio URL results in download failures.
    • An incorrect binary property name may prevent the node from reading the audio data.
    • Unsupported audio formats or corrupted files may lead to API errors.
  • Error messages:

    • Network or HTTP errors when fetching audio URL: check URL accessibility and network connection.
    • API errors related to model usage or parameters: verify model selection and options.
    • File system errors during temporary file creation/deletion: ensure proper permissions on temp directory.
  • Resolutions:

    • Confirm API key validity and permissions.
    • Verify audio URL correctness or binary data presence.
    • Use supported audio formats (commonly mp3).
    • Ensure n8n has write access to the OS temp directory.
