Overview
The "Analyze Media" operation of the Gemini AI Studio node analyzes image, video, or audio files using Google's Gemini API. You supply a media URL and a custom analysis prompt, and the selected model returns insights, descriptions, or other content analysis for that media. This is useful for automating tasks such as image captioning, video summarization, audio transcription, and content moderation in workflows.
Practical examples:
- Automatically generating alt text for uploaded images.
- Summarizing the content of a video file for documentation.
- Analyzing audio clips for sentiment or keyword extraction.
- Moderating user-uploaded media for policy compliance.
Properties
| Name | Type | Meaning |
|---|---|---|
| Model | options | The AI model to use for analysis. Different models offer varying capabilities, speed, and cost efficiency. |
| Media URL Field | string | The field in the input data that contains the URL of the image, video, or audio file to be analyzed. |
| Media Type | options | Specifies the type of media to analyze: Image, Video, or Audio. |
| Analysis Prompt | string | The prompt or instruction sent to the model along with the media, guiding the type of analysis or response required. |
| Advanced Parameters | collection | A group of optional parameters to fine-tune the model's behavior (see below for details). |
Advanced Parameters (within "Advanced Parameters" collection):
| Name | Type | Meaning |
|---|---|---|
| Temperature | number | Controls randomness in output (0 = deterministic, 2 = maximum randomness). |
| Top P | number | Nucleus sampling; samples only from the smallest set of tokens whose cumulative probability reaches top_p (e.g., 0.95 = 95% probability mass). |
| Top K | number | Samples only from the k most probable tokens (1 = greedy decoding). |
| Maximum Output Tokens | number | Maximum number of tokens to generate in the response. |
| Stop Sequences | string[] | Comma-separated sequences that will cause the model to stop generating further output. |
| Tool Selection (Choose One) | options | Enables special capabilities: None, Structured Output (JSON), Code Execution, or Google Search grounding. Only one at a time. |
| JSON Schema Definition | string | OpenAPI schema object to constrain the model output (shown if "Structured Output" is selected). |
| System Prompt | string | Sets the overall behavior or context for the model. |
| Safety Settings | fixedCollection | Allows setting safety categories and thresholds (e.g., block hate speech, explicit content, etc.). |
| Response Format (Legacy) | options | Desired format of the response: Auto, JSON, Text, or Markdown (shown if no tool is selected). |
| Enable Google Search Grounding (Legacy) | boolean | Whether to enable Google Search grounding (only for certain models, not combinable with other tools). |
| Enable Code Execution (Legacy) | boolean | Whether to enable code execution capabilities (only for certain models, not combinable with other tools). |
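
The sampling and output options above correspond to fields of the Gemini API `generationConfig` object (`temperature`, `topP`, `topK`, `maxOutputTokens`, `stopSequences`, `responseMimeType`, `responseSchema`). The following is a minimal sketch of such a mapping, not the node's actual implementation: the `AdvancedParameters` interface and the `buildGenerationConfig` helper are hypothetical names introduced for illustration, and the node's internal property names may differ.

```typescript
// Hypothetical shape of the "Advanced Parameters" collection (names are assumptions).
interface AdvancedParameters {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
  stopSequences?: string; // comma-separated, as entered in the node UI
  jsonSchema?: string; // "JSON Schema Definition", entered as a JSON string
}

// Subset of the Gemini API `generationConfig` object.
interface GenerationConfig {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
  stopSequences?: string[];
  responseMimeType?: string;
  responseSchema?: unknown;
}

// Copies only the parameters that were actually set, so unset options
// fall back to the model's defaults.
function buildGenerationConfig(params: AdvancedParameters): GenerationConfig {
  const config: GenerationConfig = {};
  if (params.temperature !== undefined) config.temperature = params.temperature;
  if (params.topP !== undefined) config.topP = params.topP;
  if (params.topK !== undefined) config.topK = params.topK;
  if (params.maxOutputTokens !== undefined) config.maxOutputTokens = params.maxOutputTokens;

  if (params.stopSequences) {
    // Split the comma-separated string and drop empty entries.
    const sequences = params.stopSequences
      .split(",")
      .map((sequence) => sequence.trim())
      .filter((sequence) => sequence.length > 0);
    if (sequences.length > 0) config.stopSequences = sequences;
  }

  if (params.jsonSchema) {
    // Structured output: request JSON and attach the parsed schema.
    // JSON.parse throws on malformed input; callers may want to catch that.
    config.responseMimeType = "application/json";
    config.responseSchema = JSON.parse(params.jsonSchema);
  }
  return config;
}

// Example values only; they are not recommendations.
const exampleConfig = buildGenerationConfig({
  temperature: 0.2,
  topP: 0.95,
  maxOutputTokens: 512,
  stopSequences: "END, STOP",
});
```

Leaving unset parameters out of the object, rather than sending explicit defaults, keeps the request small and avoids overriding model-specific defaults.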
Output
The node outputs an array of items, each containing the original input fields plus a new field:
```
{
  ...originalInputFields,
  "geminiResponse": {
    // The full response from the Gemini API.
    // Structure depends on the model, prompt, and advanced settings.
    // Typically includes generated text, and may include additional metadata or structured data if requested.
  }
}
```
- The `geminiResponse` field contains the raw response from the Gemini API, which may include:
  - Generated text or structured data based on your prompt and settings.
  - If "Structured Output (JSON)" is enabled, this will be a valid JSON object matching your schema.
  - For image generation models, it may also include binary data references (summarized as image output).
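
In a downstream step (for example, an n8n Code node) you will often want only the generated text. The sketch below assumes `geminiResponse` has the shape of a raw Gemini `generateContent` response, where text lives under `candidates[0].content.parts[].text`. The interfaces and the `extractGeminiText` helper are hypothetical names introduced here; if the node post-processes the response (for example, for structured output), the shape may differ.

```typescript
// Minimal subset of a Gemini generateContent response (assumed shape).
interface GeminiPart {
  text?: string;
}
interface GeminiCandidate {
  content?: { parts?: GeminiPart[] };
  finishReason?: string;
}
interface GeminiResponse {
  candidates?: GeminiCandidate[];
}

// Concatenates the text parts of the first candidate.
// Returns an empty string when the response has no candidates or no text.
function extractGeminiText(response: GeminiResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}

// Example item as it might leave the node; the field values are made up.
const exampleItem = {
  imageUrl: "https://example.com/photo.jpg",
  geminiResponse: {
    candidates: [
      {
        content: { parts: [{ text: "A red bicycle " }, { text: "leaning against a wall." }] },
        finishReason: "STOP",
      },
    ],
  },
};

const altText = extractGeminiText(exampleItem.geminiResponse);
```

Returning an empty string instead of throwing lets the workflow decide how to handle blocked or empty responses.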
Dependencies
- External Service: Requires access to the Google Gemini API.
- Credentials: You must configure a credential named `geminiApi` in n8n, containing:
  - `apiKey`: Your Gemini API key.
  - `apiEndpoint` (optional): Custom endpoint; defaults to `https://generativelanguage.googleapis.com/v1beta`.
- n8n Dependency: Relies on the `axios` library for HTTP requests (bundled with the node).
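
To show how the two credential fields fit together, here is a rough sketch of building a request URL and headers for the Gemini REST API. The `models/{model}:generateContent` path and the `x-goog-api-key` header are part of the public Gemini API; how this node actually assembles its requests is an assumption, and `GeminiCredentials`, `buildGenerateContentUrl`, and `buildRequestHeaders` are hypothetical names.

```typescript
// Default endpoint, as documented for the `apiEndpoint` credential field.
const DEFAULT_API_ENDPOINT = "https://generativelanguage.googleapis.com/v1beta";

// Hypothetical representation of the `geminiApi` credential.
interface GeminiCredentials {
  apiKey: string;
  apiEndpoint?: string;
}

// Builds the generateContent URL, falling back to the default endpoint
// when `apiEndpoint` is empty and tolerating trailing slashes.
function buildGenerateContentUrl(credentials: GeminiCredentials, model: string): string {
  const base = (credentials.apiEndpoint ?? "").trim() || DEFAULT_API_ENDPOINT;
  const normalizedBase = base.replace(/\/+$/, "");
  return `${normalizedBase}/models/${encodeURIComponent(model)}:generateContent`;
}

// Sends the key in a header rather than in the query string,
// which keeps it out of URLs that may end up in logs.
function buildRequestHeaders(credentials: GeminiCredentials): Record<string, string> {
  return {
    "Content-Type": "application/json",
    "x-goog-api-key": credentials.apiKey,
  };
}

// The model name below is only an example value.
const exampleUrl = buildGenerateContentUrl({ apiKey: "YOUR_API_KEY" }, "gemini-1.5-flash");
```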
Troubleshooting
Common Issues:
- Missing or Invalid Media URL: If the specified media URL field is empty or invalid, the node will throw an error: `"Image URL is required for image analysis"`
  - Resolution: Ensure the input data contains a valid URL in the specified field.
- API Errors: If the Gemini API returns an error, the node will throw an error message like: `"Gemini API Error: [status] - [error message]"`
  - Resolution: Check your API key, endpoint, and quota, and ensure the request parameters are valid.
- Invalid JSON Schema: If you provide an invalid JSON schema for structured output, the node logs an error but continues.
  - Resolution: Validate your JSON schema before use.
- Model/Tool Compatibility: Some features (like code execution or Google Search) are only available for specific models.
  - Resolution: Refer to the model documentation and select compatible options.
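
Two of the issues above (a missing media URL and a malformed JSON schema) can be caught before the node runs, for instance in a preceding Code node. The sketch below is one possible pre-check, not part of the node itself: `checkMediaUrl`, `checkJsonSchema`, and `CheckResult` are hypothetical names, and the rule that media URLs must start with `http://` or `https://` is an assumption about what the node accepts.

```typescript
type CheckResult = { ok: true } | { ok: false; reason: string };

// Verifies that the configured field holds a non-empty http(s) URL.
// The http(s)-only rule is an assumption made for this sketch.
function checkMediaUrl(item: Record<string, unknown>, field: string): CheckResult {
  const value = item[field];
  if (typeof value !== "string" || value.trim() === "") {
    return { ok: false, reason: `Field "${field}" is missing or empty` };
  }
  if (!/^https?:\/\/\S+$/i.test(value.trim())) {
    return { ok: false, reason: `Field "${field}" does not look like an http(s) URL` };
  }
  return { ok: true };
}

// Verifies that the schema text parses as JSON and is a plain object.
// This checks syntax only; it does not validate the schema's semantics.
function checkJsonSchema(schemaText: string): CheckResult {
  try {
    const parsed: unknown = JSON.parse(schemaText);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return { ok: false, reason: "Schema must be a JSON object" };
    }
    return { ok: true };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, reason: `Schema is not valid JSON: ${message}` };
  }
}

// Example usage with made-up input data.
const urlCheck = checkMediaUrl({ imageUrl: "https://example.com/photo.jpg" }, "imageUrl");
const schemaCheck = checkJsonSchema('{"type": "object", "properties": {"caption": {"type": "string"}}}');
```

Failing items can then be routed to an error branch instead of triggering a node error mid-workflow.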
Links and References
- Google Gemini API Documentation
- OpenAPI Specification for JSON Schema
- n8n Documentation: Credentials
- Gemini Model Capabilities
Note: Always review the Gemini API documentation for the latest supported features, model limitations, and best practices.