Google Vertex CAMB AI

Interact with Google Vertex AI MARS7 for text-to-speech synthesis

Overview

This node integrates with Google Vertex AI MARS7 to perform text-to-speech synthesis using voice cloning technology. It converts input text into spoken audio, mimicking the voice characteristics of a reference audio file provided by the user. This is particularly useful for applications requiring personalized or branded voice outputs, such as virtual assistants, audiobooks, accessibility tools, or automated announcements.

For example, you can input a paragraph of text and provide a URL to an audio sample of a specific speaker’s voice. The node will generate speech audio that sounds like the referenced voice speaking the input text.

Properties

Name	Meaning
Text	The text content that will be converted into speech.
Audio URL	URL pointing to the reference audio file used for voice cloning (to mimic the voice).
Language	Target language/accent for the synthesized speech. Options include:
	- Chinese (China)
	- English (UK)
	- English (US)
	- French (Canada)
	- French (France)
	- German (Germany)
	- Japanese (Japan)
	- Korean (South Korea)
	- Spanish (Spain)
	- Spanish (US)
Additional Options	Optional extra settings:
- Reference Text	Optional transcript of the reference audio, used to improve voice cloning quality.
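
As a rough illustration of how the properties above fit together, the sketch below models them as a small Python structure. It is not the node's implementation: the class name `SpeechRequest`, the method `to_params`, the dictionary key names, and the use of BCP 47 tags such as `en-GB` as language codes are all assumptions made for the example. Only the language labels come from the table.

```python
from dataclasses import dataclass
from typing import Optional

# Labels are copied from the Language property. The codes are standard
# BCP 47 tags and are an ASSUMPTION about what the node sends.
LANGUAGE_CODES = {
    "Chinese (China)": "zh-CN",
    "English (UK)": "en-GB",
    "English (US)": "en-US",
    "French (Canada)": "fr-CA",
    "French (France)": "fr-FR",
    "German (Germany)": "de-DE",
    "Japanese (Japan)": "ja-JP",
    "Korean (South Korea)": "ko-KR",
    "Spanish (Spain)": "es-ES",
    "Spanish (US)": "es-US",
}


@dataclass
class SpeechRequest:
    """Hypothetical container mirroring the node's properties."""

    text: str
    audio_url: str
    language: str  # one of the labels in LANGUAGE_CODES
    reference_text: Optional[str] = None  # Additional Options > Reference Text

    def to_params(self) -> dict:
        """Return a plain dict; the key names are illustrative only."""
        if self.language not in LANGUAGE_CODES:
            raise ValueError(f"Unsupported language: {self.language!r}")
        params = {
            "text": self.text,
            "audio_url": self.audio_url,
            "language": LANGUAGE_CODES[self.language],
        }
        # The optional transcript is included only when one was provided.
        if self.reference_text:
            params["reference_text"] = self.reference_text
        return params
```

For instance, `SpeechRequest("Hello", "https://example.com/voice.wav", "English (UK)").to_params()` yields a dict with the language mapped to `en-GB` and no `reference_text` key, since that option was left unset.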

Output

The node outputs JSON data containing the result of the text-to-speech operation. Typically, this includes:

- The synthesized speech audio data, or a link/reference to it.
- Metadata about the synthesis process or the audio format.

If binary data output is supported, it would represent the generated audio file in a suitable format (e.g., WAV or MP3), ready for playback or further processing.

Dependencies

- An API key credential for authenticating with Google Vertex AI services.
- Internet access to reach the Google Vertex AI endpoints.
- A reference audio file hosted at a publicly reachable URL.
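
The requirement that the reference audio be publicly reachable can be partly checked before the node runs. The sketch below is a cheap offline heuristic using only the Python standard library; the function name `looks_publicly_reachable` is hypothetical, and the check is an assumption about what "publicly reachable" means in practice (an `http` or `https` URL whose host is not `localhost` or a private, loopback, or otherwise non-global IP address). It does not download the file, so a passing result does not guarantee the service can fetch it.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Offline heuristic; it does not prove the file can be downloaded."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    if parsed.scheme not in ("http", "https"):
        return False
    if not host or host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # The host is a DNS name; no resolution is attempted here.
        return True
    # Literal IPs must be globally routable (rejects 10.x, 192.168.x, 127.x, ...).
    return ip.is_global
```

A URL such as `https://192.168.1.10/voice.wav` or `file:///tmp/voice.wav` is rejected by this check, while `https://example.com/voice.wav` passes. Files behind a login or a signed-URL expiry would still pass here and fail later.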

Troubleshooting

- Invalid or inaccessible Audio URL: Ensure the URL points to a valid audio file that is accessible from the internet. Private or protected URLs may cause failures.
- Unsupported language code: Select one of the supported languages listed under Properties.
- API authentication errors: Verify that the API key credential is configured correctly and has the necessary permissions.
- Empty or invalid text input: Provide non-empty text to synthesize.
- Reference Text mismatch: If you use the optional Reference Text, make sure it accurately transcribes the reference audio; its purpose is to improve cloning quality.
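
The two text-related items above can be guarded against before the node is invoked. The following is a minimal sketch, assuming the text values are available as Python strings; the helper name `clean_text_inputs` is hypothetical. It rejects empty or whitespace-only text and treats a blank Reference Text as "not provided". It cannot verify that the Reference Text actually matches the reference audio, which still needs a manual check.

```python
from typing import Optional, Tuple


def clean_text_inputs(
    text: str, reference_text: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Trim both values; fail on empty text, drop a blank reference text."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Text must not be empty")
    # A whitespace-only transcript is treated the same as no transcript.
    ref = reference_text.strip() if reference_text else None
    return cleaned, ref or None
```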