Overview
This node implements a custom AI agent using Google Gemini models with optional S3/MinIO storage for long-term memory. It processes user prompts or multimodal inputs, interacts with connected AI tools and memory nodes, and generates AI responses. It is useful for building conversational AI assistants that can leverage external tools and maintain context over time, with support for storing and retrieving conversation data and files in S3 storage.
Use Case Examples
- A customer support chatbot that remembers past interactions and uses external tools to fetch data.
- An AI assistant that processes text and image inputs, stores conversation history in S3, and generates context-aware responses.
Properties
| Name | Meaning |
|---|---|
| Model | The specific Google Gemini model to use for generating AI responses. |
| S3 Storage for Memory | Configuration for S3 or MinIO to store files for long-term memory, including bucket name, endpoint URL, and region. |
| System Prompt | The system-level prompt that guides the AI assistant's behavior and instructions. |
| Enable Multimodal Input | Flag to enable or disable multimodal input processing (e.g., text plus files). |
| Context IDs | Identifiers for organizing files in S3 storage, including client ID and dialogue ID to structure stored data. |
| User Prompt | The user input prompt text when multimodal input is disabled. |
| Options | Additional generation options such as temperature and maximum output tokens. |
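The Context IDs property suggests that stored files are grouped by client and by dialogue. The node's actual object-key layout is not documented here, so the following is only a sketch under an assumed `<clientId>/<dialogueId>/<filename>` layout; the function name `build_object_key` and the layout itself are hypothetical.

```python
def build_object_key(client_id: str, dialogue_id: str, filename: str) -> str:
    """Join the two context IDs and a file name into one S3 object key.

    Assumption: keys look like "<clientId>/<dialogueId>/<filename>".
    This is an illustrative layout, not the node's confirmed scheme.
    """
    parts = [client_id, dialogue_id, filename]
    for part in parts:
        # Reject empty segments and embedded slashes so one dialogue
        # cannot accidentally write into another dialogue's prefix.
        if not part or "/" in part:
            raise ValueError(f"invalid key segment: {part!r}")
    return "/".join(parts)


key = build_object_key("client-42", "dialogue-7", "photo.png")
print(key)
```

Keeping both IDs in the key prefix would let all files for one dialogue be listed or deleted with a single prefix query, which is one plausible reason for exposing them as separate fields.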
Output
JSON
- result - The generated AI response text from the Gemini model.
- usage
  - inputTokens - Number of tokens used in the input prompt.
  - outputTokens - Number of tokens used in the output response.
  - totalTokens - Total tokens consumed in the request.
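As a rough illustration of consuming this output downstream, the sketch below parses one output item and summarizes its token usage. The field names come from the list above; the sample values and the helper name `summarize_usage` are made up for the example.

```python
import json

# Illustrative item only: the field names follow the Output section,
# the values are invented for this example.
sample_item = """
{
  "result": "Hello! How can I help you today?",
  "usage": {"inputTokens": 12, "outputTokens": 8, "totalTokens": 20}
}
"""


def summarize_usage(item_json: str) -> str:
    """Return a one-line token summary for a single output item."""
    item = json.loads(item_json)
    # Fall back to zeros if the usage object or a counter is absent.
    usage = item.get("usage", {})
    return (
        f"{usage.get('inputTokens', 0)} in / "
        f"{usage.get('outputTokens', 0)} out / "
        f"{usage.get('totalTokens', 0)} total"
    )


print(summarize_usage(sample_item))
```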
Dependencies
- Google PaLM API key credential for accessing Gemini models
- Optional S3/MinIO credentials for memory storage
Troubleshooting
- Ensure the Google PaLM API key is valid and has sufficient quota to avoid authentication or rate limit errors.
- If using S3 storage, verify the bucket name, endpoint URL, region, and credentials are correct to prevent connection or permission errors.
- User Prompt must not be empty when multimodal input is disabled; otherwise, the node throws an error.
- The Gemini API may return no candidates or unexpected response formats; check network connectivity and API version compatibility.
- If multimodal input is enabled, ensure binary data is properly provided and accessible to avoid errors in file processing and upload.
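Some of the checks above can be run before the node executes, for example in an upstream step. The sketch below covers only the empty-prompt rule and the presence of the S3 settings; the function name and the `bucket`, `endpoint`, and `region` dictionary keys are hypothetical and do not reflect the node's internal parameter names.

```python
from typing import Dict, List, Optional


def preflight_problems(
    multimodal_enabled: bool,
    user_prompt: str,
    s3_config: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Collect configuration problems instead of failing on the first one."""
    problems: List[str] = []

    # User Prompt is only required when multimodal input is disabled.
    if not multimodal_enabled and not user_prompt.strip():
        problems.append("User Prompt is empty while multimodal input is disabled")

    # S3 storage is optional; check its settings only when it is configured.
    # The key names below are assumptions made for this example.
    if s3_config is not None:
        for field in ("bucket", "endpoint", "region"):
            if not s3_config.get(field, "").strip():
                problems.append(f"S3 setting '{field}' is missing")

    return problems


print(preflight_problems(False, "   "))
print(preflight_problems(False, "Hi", {"bucket": "memory", "endpoint": "", "region": "us-east-1"}))
```

A check like this cannot confirm that credentials or permissions are valid; it only catches blank values before a request is made.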