Overview
The node provides an interface for analyzing images with the Google Gemini API. Images can be supplied either as URLs or as binary files. The node sends the image data together with a text prompt (e.g., "What's in this image?") to a specified model hosted on the Google Gemini service, which returns an analysis of the image content.
This node is useful for scenarios such as:
- Automatically generating descriptions or tags for images.
- Extracting insights or metadata from images for cataloging or search.
- Integrating image understanding capabilities into workflows without manual intervention.
For example, you could input a URL of a product image and receive a textual description that can be used for e-commerce listings or accessibility features.
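To make the request flow above concrete, here is a minimal sketch of how a prompt and one image could be packaged into a single request. It assumes the Server URL points at an endpoint that follows the public Gemini REST `generateContent` format. The `AnalyzeImageParams` and `BuiltRequest` types and the `buildAnalyzeImageRequest` helper are hypothetical names chosen for illustration, not the node's actual internals.

```typescript
// Hypothetical parameter bundle mirroring the node properties described here.
interface AnalyzeImageParams {
  serverUrl: string;   // "Server URL" property
  apiKey: string;      // "API Key" credential
  model: string;       // "Model" property (example value below is an assumption)
  prompt: string;      // "Text Input" property
  imageBase64: string; // base64-encoded image bytes
  mimeType: string;    // e.g. "image/png"
  maxTokens?: number;  // "Length of Description (Max Tokens)" option
}

interface BuiltRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    contents: Array<{ parts: Array<Record<string, unknown>> }>;
    generationConfig?: { maxOutputTokens: number };
  };
}

// Builds the URL, headers, and JSON body without sending anything.
function buildAnalyzeImageRequest(p: AnalyzeImageParams): BuiltRequest {
  // Drop trailing slashes so the path joins cleanly.
  const base = p.serverUrl.replace(/\/+$/, "");
  // Accept both "some-model" and "models/some-model".
  const model = p.model.startsWith("models/") ? p.model : `models/${p.model}`;

  const body: BuiltRequest["body"] = {
    contents: [
      {
        parts: [
          { text: p.prompt },
          { inline_data: { mime_type: p.mimeType, data: p.imageBase64 } },
        ],
      },
    ],
  };
  if (p.maxTokens !== undefined) {
    body.generationConfig = { maxOutputTokens: p.maxTokens };
  }

  return {
    url: `${base}/v1beta/${model}:generateContent`,
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": p.apiKey,
    },
    body,
  };
}
```

The returned object can be passed to any HTTP client, for example `fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) })`. Keeping request construction separate from sending makes the payload easy to inspect when a call fails.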
Properties
| Name | Meaning |
|---|---|
| Server URL | The base URL of the Google Gemini API endpoint to send requests to. |
| API Key | The API key credential required to authenticate requests to the Google Gemini API. |
| Model | The specific model ID to use for image analysis. Can be selected from a list or entered manually. |
| Text Input | A textual prompt or question about the image, e.g., "What's in this image?" |
| Input Type | Specifies whether the input images are provided as URLs or as binary file data. Options: Image URL(s), Binary File(s) |
| URL(s) | One or more comma-separated URLs of images to analyze (used if Input Type is URL). |
| Input Data Field Name(s) | Name(s) of the binary fields containing image data to analyze (used if Input Type is binary). Multiple names separated by commas. |
| Simplify Output | Whether to simplify the response output for easier consumption (true or false). |
| Options | Additional options, including Length of Description (Max Tokens), which limits the length and detail of the generated image description. |
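Both URL(s) and Input Data Field Name(s) accept comma-separated values. The following sketch shows one plausible way to split such a value into a clean list. The helper name `splitCommaList` is hypothetical, and the node's own parsing may differ.

```typescript
// Splits a comma-separated property value into trimmed, non-empty entries.
function splitCommaList(raw: string): string[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Example: stray whitespace and a trailing comma are ignored.
const urls = splitCommaList(" https://example.com/a.png, https://example.com/b.jpg ,");
console.log(urls); // [ 'https://example.com/a.png', 'https://example.com/b.jpg' ]
```

Under this approach a URL that itself contains a comma would be split into two entries, so such URLs would need to be percent-encoded first.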
Output
The node outputs JSON data representing the analysis results returned by the Google Gemini model. This typically includes descriptive information about the image content based on the prompt provided.
If multiple images are analyzed, the output contains a corresponding entry for each image.
If binary input is used, the node processes the binary image data but does not output binary data itself; the output remains JSON describing the analysis.
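As an illustration of what Simplify Output could do, the sketch below reduces a Gemini-style response to its generated text. It assumes the response follows the public `generateContent` shape (`candidates[].content.parts[].text`). The `simplifyResponse` helper and the `{ text }` result shape are hypothetical; the exact simplified format the node emits is not specified here.

```typescript
// Minimal subset of a Gemini-style response (assumed shape).
interface GeminiPart {
  text?: string;
}
interface GeminiCandidate {
  content?: { parts?: GeminiPart[] };
}
interface GeminiResponse {
  candidates?: GeminiCandidate[];
}

// Joins the text parts of the first candidate into a single string.
// Returns an empty string when the response carries no text.
function simplifyResponse(res: GeminiResponse): { text: string } {
  const parts = res.candidates?.[0]?.content?.parts ?? [];
  return { text: parts.map((part) => part.text ?? "").join("") };
}
```

With Simplify Output disabled, downstream nodes would instead have to navigate the full response structure themselves.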
Dependencies
- Requires access to the Google Gemini API endpoint.
- Requires a valid API key credential for authentication.
- Requires a model ID that is available on the Google Gemini platform.
- Network connectivity to the specified Server URL must be available.
Troubleshooting
- Invalid API Key or Authentication Errors: Ensure the API key is valid, active, and has permissions to access the Google Gemini API.
- Incorrect Model ID: Verify the model ID exists and is accessible under your account.
- Malformed Image URLs or Binary Data: Check that image URLs are reachable and correctly formatted; for binary input, ensure the field names match those containing the image data.
- Timeouts or Network Issues: Confirm network connectivity to the API server URL.
- Exceeding Max Tokens Limit: If the description is truncated, raise the "Length of Description" option; if it is longer than needed, lower it.
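Two of the checks above can be automated. The sketch below assumes n8n-style items, where binary data lives under `item.binary` keyed by field name, and a Gemini-style response whose candidates carry a `finishReason` (the public API uses `MAX_TOKENS` when the output limit cut the answer short). The helper names `findMissingBinaryFields` and `wasTruncated` are hypothetical.

```typescript
// Minimal stand-in for an n8n item; only the fields used here.
interface ItemLike {
  json: Record<string, unknown>;
  binary?: Record<string, unknown>;
}

// Returns the configured field names that are absent from the item's binary data.
function findMissingBinaryFields(item: ItemLike, fieldNames: string[]): string[] {
  const available = item.binary ?? {};
  return fieldNames.filter(
    (name) => !Object.prototype.hasOwnProperty.call(available, name),
  );
}

// Minimal subset of a Gemini-style response (assumed shape).
interface ResponseLike {
  candidates?: Array<{ finishReason?: string }>;
}

// True when the first candidate stopped because the token limit was reached.
function wasTruncated(res: ResponseLike): boolean {
  return res.candidates?.[0]?.finishReason === "MAX_TOKENS";
}
```

A non-empty result from `findMissingBinaryFields` points to a mismatch in Input Data Field Name(s), while `wasTruncated` returning `true` suggests raising the "Length of Description" option.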
Links and References
- Google Gemini API Documentation
- n8n Documentation on Creating Custom Nodes
- Image Analysis Concepts (for general understanding of image analysis APIs)