Gemini Agent (Custom)

A self-contained AI agent for Google Gemini models with S3 memory

Overview

This node implements a custom AI agent built on Google Gemini models, with optional S3/MinIO storage for long-term memory. It accepts a text prompt or multimodal input, interacts with connected AI tools and memory nodes, and returns the generated response. It suits conversational assistants that need external tools and persistent context, since conversation data and files can be stored in and retrieved from S3.

Use Case Examples

A customer support chatbot that remembers past interactions and uses external tools to fetch data.
An AI assistant that processes text and image inputs, stores conversation history in S3, and generates context-aware responses.

Properties

Name	Meaning
Model	The specific Google Gemini model to use for generating AI responses.
S3 Storage for Memory	Configuration for S3 or MinIO to store files for long-term memory, including bucket name, endpoint URL, and region.
System Prompt	The system-level prompt that guides the AI assistant's behavior and instructions.
Enable Multimodal Input	Flag to enable or disable multimodal input processing (e.g., text plus files).
Context IDs	Identifiers for organizing files in S3 storage, including client ID and dialogue ID to structure stored data.
User Prompt	The user input prompt text when multimodal input is disabled.
Options	Additional generation options such as temperature and maximum output tokens.
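
The Context IDs property groups stored files by client and dialogue. The sketch below shows one way such IDs could map to an S3 object key. The `<clientId>/<dialogueId>/<fileName>` layout, the `ContextIds` type, and the `buildObjectKey` helper are assumptions made for illustration; the node's actual key layout is not documented here.

```typescript
// Hypothetical key layout: <clientId>/<dialogueId>/<fileName>.
// This is an illustrative assumption, not the node's documented behavior.
interface ContextIds {
  clientId: string;
  dialogueId: string;
}

function buildObjectKey(ctx: ContextIds, fileName: string): string {
  // Trim whitespace and replace slashes so an ID cannot add extra path segments.
  const clean = (s: string): string => s.trim().replace(/\//g, "_");
  return [clean(ctx.clientId), clean(ctx.dialogueId), clean(fileName)].join("/");
}

const exampleKey = buildObjectKey(
  { clientId: "acme", dialogueId: "d-42" },
  "photo.png",
);
console.log(exampleKey);
```

Keeping one prefix per dialogue would make it straightforward to list or delete everything belonging to a single conversation.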

Output

JSON

result - The generated AI response text from the Gemini model.
usage
- inputTokens - Number of tokens used in the input prompt.
- outputTokens - Number of tokens used in the output response.
- totalTokens - Total tokens consumed in the request.
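
For downstream code, the output fields listed above can be described as a TypeScript type. This is a minimal sketch: the interface names and the `isGeminiAgentOutput` guard are hypothetical, and the sample values are made up for illustration.

```typescript
// Shape of the node's JSON output, based on the fields listed above.
// Type and function names are illustrative, not part of the node.
interface GeminiAgentUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

interface GeminiAgentOutput {
  result: string;
  usage: GeminiAgentUsage;
}

// Runtime check that an unknown value has the documented shape.
function isGeminiAgentOutput(value: unknown): value is GeminiAgentOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.result !== "string") return false;
  if (typeof v.usage !== "object" || v.usage === null) return false;
  const usage = v.usage as Record<string, unknown>;
  return ["inputTokens", "outputTokens", "totalTokens"].every(
    (key) => typeof usage[key] === "number",
  );
}

// Made-up sample item for illustration only.
const sampleOutput: unknown = {
  result: "Hello! How can I help?",
  usage: { inputTokens: 12, outputTokens: 5, totalTokens: 17 },
};

if (isGeminiAgentOutput(sampleOutput)) {
  console.log(sampleOutput.result, sampleOutput.usage.totalTokens);
}
```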

Dependencies

Google PaLM API key credential for accessing Gemini models
Optional S3/MinIO credentials for memory storage

Troubleshooting

Ensure the Google PaLM API key is valid and has sufficient quota to avoid authentication or rate limit errors.
If using S3 storage, verify the bucket name, endpoint URL, region, and credentials are correct to prevent connection or permission errors.
User Prompt must not be empty when multimodal input is disabled; otherwise, the node throws an error.
The Gemini API may return no candidates or unexpected response formats; check network connectivity and API version compatibility.
If multimodal input is enabled, ensure binary data is properly provided and accessible to avoid errors in file processing and upload.
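
The empty-prompt rule above can be checked before an item reaches the node, for example in a preceding code step. This is a sketch only: the `AgentInput` type, its field names, and `checkUserPrompt` are hypothetical and do not reflect the node's internal implementation.

```typescript
// Hypothetical pre-check mirroring the documented rule:
// User Prompt must not be empty when multimodal input is disabled.
interface AgentInput {
  enableMultimodal: boolean;
  userPrompt?: string;
}

// Returns a problem description, or null when the rule is satisfied.
function checkUserPrompt(input: AgentInput): string | null {
  if (input.enableMultimodal) {
    return null; // The rule only applies when multimodal input is disabled.
  }
  const prompt = input.userPrompt ?? "";
  if (prompt.trim() === "") {
    return "User Prompt is empty while multimodal input is disabled.";
  }
  return null;
}

console.log(checkUserPrompt({ enableMultimodal: false, userPrompt: "   " }));
```

Treating a whitespace-only prompt as empty is a choice made in this sketch; the node itself may apply a stricter or looser check.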
