Gemini

Interact with Gemini API

Actions

Text Actions
- Generate Text
Audio Actions
- Generate Audio

Overview

This node integrates with the Gemini API to generate text based on conversational input. It is designed for scenarios where users want to interact with advanced language models to produce natural language responses, continue conversations, or generate content dynamically. Typical use cases include chatbots, content creation, summarization, and interactive assistants.

For example, you can provide a series of messages representing a conversation history, and the node will generate the next message from the model. You can also attach files (like images or documents) to enrich the context of the conversation.
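The conversation-plus-attachments idea above can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the function name `build_contents` and the input keys `prompt`, `role`, `mime_type`, and `data` are hypothetical, while the output follows the `contents` / `parts` / `inline_data` shape used by the Gemini REST API.

```python
from typing import Any, Optional


def build_contents(
    messages: list[dict[str, str]],
    attachments: Optional[list[dict[str, str]]] = None,
) -> list[dict[str, Any]]:
    """Turn {prompt, role} messages into a Gemini-style `contents` list.

    Input key names are assumptions made for this sketch.
    """
    contents: list[dict[str, Any]] = []
    for msg in messages:
        role = msg["role"].lower()  # "User" -> "user", "Model" -> "model"
        if role not in ("user", "model"):
            raise ValueError(f"unsupported role: {msg['role']!r}")
        contents.append({"role": role, "parts": [{"text": msg["prompt"]}]})

    if attachments:
        # Attachments go on the last user message, as described above.
        last_user = next(
            (c for c in reversed(contents) if c["role"] == "user"), None
        )
        if last_user is None:
            raise ValueError("attachments need at least one user message")
        for att in attachments:
            last_user["parts"].append(
                {"inline_data": {"mime_type": att["mime_type"], "data": att["data"]}}
            )
    return contents
```

The returned list is what would be sent as the `contents` field of a generate-content request; the model's reply corresponds to the next "Model" message in the conversation.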

Properties

Name	Meaning
API Key	The API key credential required to authenticate requests to the Gemini API.
Model Name or ID	The specific language model to use for text generation. Choose from a list or specify a custom model ID.
Text Messages	The conversation history sent to the model. Each message has a Prompt (the content of the message) and a Role (the sender, either "User" or "Model").
File Attachments	Optional files to attach to the last user message in the conversation. Each attachment provides the file data, base64-encoded with a MIME type prefix, and the MIME type of the file (e.g., image/jpeg, application/pdf).
Simplify Output	Boolean flag to simplify the output by returning only the generated text response instead of full details.
JSON Output	Boolean flag to request the response in JSON format from the API.
Options	Additional parameters to customize generation: Frequency Penalty (penalizes tokens the more often they have already appeared); Max Output Tokens (maximum number of tokens to generate); Presence Penalty (penalizes tokens that have already appeared, encouraging new ones); Safety Settings (configure content blocking); System Instruction (instructions that guide model behavior); Temperature (controls randomness in token selection); Thinking Config (include thoughts and set a thinking budget); Top K (limits sampling to the K most probable tokens); Top P (cumulative probability threshold for token selection).
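To show how the properties above might translate into an API request, here is a hedged sketch. The snake_case option keys and the function name `build_request_body` are hypothetical; the camelCase field names (`generationConfig`, `systemInstruction`, `safetySettings`, `thinkingConfig`, `responseMimeType`) are those of the Gemini REST API. Mapping "JSON Output" to `responseMimeType` is an assumption about how the node behaves, not something this page confirms.

```python
from typing import Any, Optional

# Hypothetical option keys (left) mapped to Gemini generationConfig fields (right).
_GENERATION_FIELDS = {
    "frequency_penalty": "frequencyPenalty",
    "max_output_tokens": "maxOutputTokens",
    "presence_penalty": "presencePenalty",
    "temperature": "temperature",
    "top_k": "topK",
    "top_p": "topP",
}


def build_request_body(
    contents: list[dict[str, Any]],
    options: Optional[dict[str, Any]] = None,
    json_output: bool = False,
) -> dict[str, Any]:
    """Assemble a generate-content request body from node-style options."""
    options = options or {}
    body: dict[str, Any] = {"contents": contents}

    # Copy only the sampling options that were actually set.
    config: dict[str, Any] = {
        api_name: options[key]
        for key, api_name in _GENERATION_FIELDS.items()
        if key in options
    }

    thinking: dict[str, Any] = {}
    if "include_thoughts" in options:
        thinking["includeThoughts"] = bool(options["include_thoughts"])
    if "thinking_budget" in options:
        thinking["thinkingBudget"] = int(options["thinking_budget"])
    if thinking:
        config["thinkingConfig"] = thinking

    if json_output:
        # Assumption: "JSON Output" asks the API for a JSON-formatted reply.
        config["responseMimeType"] = "application/json"

    if config:
        body["generationConfig"] = config
    if options.get("system_instruction"):
        body["systemInstruction"] = {
            "parts": [{"text": options["system_instruction"]}]
        }
    if options.get("safety_settings"):
        body["safetySettings"] = list(options["safety_settings"])
    return body
```

Leaving unset options out of the body, rather than sending nulls or defaults, lets the API apply its own per-model defaults.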

Output

The node outputs JSON data containing the generated text response from the Gemini API. If "Simplify Output" is enabled, the output contains only the plain text of the generated response. Otherwise, it may include additional metadata such as the full JSON response from the API.

If "JSON Output" is enabled, the response is returned in JSON format as provided by the API.

The node does not output binary data.
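As an illustration of what "Simplify Output" might do, the sketch below pulls the plain text out of a full response. The function name `simplify_response` is hypothetical; the `candidates[0].content.parts[].text` path matches the Gemini REST response shape. Skipping parts flagged as `thought` is a choice made for this example, and the node may handle thoughts differently.

```python
from typing import Any


def simplify_response(response: dict[str, Any]) -> str:
    """Return only the generated text from a full generate-content response."""
    candidates = response.get("candidates") or []
    if not candidates:
        # Nothing was generated, e.g. because the prompt was blocked.
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Join text parts; skip "thought" parts (a choice made for this sketch).
    return "".join(
        part["text"] for part in parts if "text" in part and not part.get("thought")
    )
```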

Dependencies

Requires an active API key credential for the Gemini API.
The node depends on network access to the Gemini API endpoint.
No other external dependencies are required.
The node uses internal helper methods to fetch available models dynamically.
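The dynamic model lookup mentioned above could work roughly like this. This is only a sketch: the node's own helper is not shown on this page, the function name is hypothetical, and the sample model names are placeholders. It assumes a model-list response in the Gemini REST shape, where each entry carries a `name` and a `supportedGenerationMethods` list.

```python
from typing import Any


def text_generation_models(list_response: dict[str, Any]) -> list[str]:
    """Pick the IDs of models that support text generation from a model-list response."""
    names = []
    for model in list_response.get("models", []):
        if "generateContent" in model.get("supportedGenerationMethods", []):
            # API names look like "models/<id>"; keep only the ID part.
            names.append(model["name"].split("/", 1)[-1])
    return sorted(names)


# Placeholder data standing in for a real API response.
sample = {
    "models": [
        {"name": "models/example-chat-b", "supportedGenerationMethods": ["generateContent"]},
        {"name": "models/example-embed", "supportedGenerationMethods": ["embedContent"]},
        {"name": "models/example-chat-a", "supportedGenerationMethods": ["generateContent", "countTokens"]},
    ]
}
model_ids = text_generation_models(sample)
```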

Troubleshooting

Audio resource not supported: Although "Generate Audio" is listed under Actions, attempting to use the "audio" resource will throw an error because this feature is not implemented yet.
Invalid API key or authentication failure: Ensure the API key is correct and has necessary permissions.
Model not found: If specifying a model ID manually, verify that the model exists and is accessible.
Malformed file attachments: File data must be base64 encoded with a proper MIME type prefix; otherwise, the API may reject the request.
Exceeding token limits: Setting "Max Output Tokens" above the limit supported by the selected model may cause errors, while setting it too low may truncate responses.
Safety settings blocking content: Overly strict safety settings might block valid content; adjust thresholds accordingly.
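For the malformed-attachment case above, a pre-flight check can catch bad input before the request is sent. The sketch assumes the "MIME type prefix" means a data URI of the form `data:<mime>;base64,<payload>`; that format, and the function name `parse_attachment`, are assumptions rather than documented behavior of the node.

```python
import base64
import binascii
import re

# Assumed attachment format: data:<mime type>;base64,<payload>
_DATA_URI = re.compile(
    r"^data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def parse_attachment(data_uri: str) -> tuple[str, str]:
    """Split a base64 data URI into (mime_type, payload), validating both."""
    match = _DATA_URI.match(data_uri.strip())
    if not match:
        raise ValueError("expected 'data:<mime type>;base64,<payload>'")
    payload = match.group("data")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("attachment data is not valid base64") from exc
    return match.group("mime"), payload
```

For example, `parse_attachment("data:text/plain;base64,aGVsbG8=")` returns the MIME type `text/plain` together with the untouched payload, while a string lacking the prefix raises `ValueError`.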

Links and References

Gemini API Documentation
n8n Expressions Documentation
OpenAI-like Language Model Parameters Explained (for understanding penalties and temperature)
