FreeLim

Reverse-Engineered LLM (逆向LLM)

Overview

This node provides document and image parsing under the resource named "跃问" (Yuewen), through the operation 文档/图片解析 (Document/Image Parsing). It analyzes images supplied either as URLs or as binary data, optionally using a specified model or intelligent agent. It accepts a text query about the image content and can simplify the response output.

Common scenarios where this node would be beneficial include:

  • Extracting information or understanding content from images hosted online.
  • Processing binary image files directly within workflows.
  • Using AI models or agents to interpret images and provide textual answers.
  • Integrating voice synthesis options for audio output of results.

Practical examples:

  • Uploading product photos via URLs to extract descriptions or features.
  • Sending scanned documents as base64 binary data for text extraction.
  • Asking questions about an image, e.g., "What is in this picture?" and receiving analyzed responses.
  • Generating spoken feedback using selectable voice profiles.

Properties

Name | Meaning
Model (模型, assistantId) | Model or intelligent agent identifier; can be left blank if unknown.
Text Input (文本输入, text) | Text query related to the image, e.g., "What is this image?" (required for certain resources).
Input Type (输入类型, inputType) | How the image is supplied: "图片链接" (image URL) or "二进制文件" (binary file).
URL Links (URL链接, imageUrls) | Comma-separated list of image URLs to analyze (required if the input type is image URL).
Input Data Field Name (输入数据字段名称) | Name of the binary property that holds the image data (default "data"; used if the input type is binary file).
Simplify Output (简化输出, simplify) | Boolean flag that simplifies the response output (default true).
Voice List (语音列表, voice) | Voice selection for text-to-speech output; supports official voices or cloned voices.
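
The constraints in the property table can be sketched as a small validation helper. This is only an illustration under stated assumptions, not the node's implementation: the function `build_params` is hypothetical, the values `"url"` and `"binary"` stand in for the undocumented internal values behind the 图片链接 and 二进制文件 options, and `binaryPropertyName` is an assumed key for the 输入数据字段名称 field.

```python
from typing import Optional


def build_params(
    text: str,
    input_type: str = "url",          # placeholder values: "url" or "binary"
    image_urls: str = "",
    binary_property: str = "data",    # documented default field name
    assistant_id: Optional[str] = None,
    simplify: bool = True,            # documented default
) -> dict:
    """Assemble a parameter dict mirroring the property table (hypothetical helper)."""
    if input_type not in ("url", "binary"):
        raise ValueError(f"unknown input type: {input_type!r}")

    params = {"text": text, "inputType": input_type, "simplify": simplify}

    # The model field may be left blank, so include it only when given.
    if assistant_id:
        params["assistantId"] = assistant_id

    if input_type == "url":
        # imageUrls is a comma-separated list; split it and drop empty entries.
        urls = [u.strip() for u in image_urls.split(",") if u.strip()]
        if not urls:
            raise ValueError("imageUrls is required when the input type is URL")
        params["imageUrls"] = urls
    else:
        if not binary_property:
            raise ValueError("a binary field name is required for binary input")
        params["binaryPropertyName"] = binary_property  # assumed key name

    return params


example = build_params(
    "What is this image?",
    image_urls="https://example.com/a.png, https://example.com/b.png",
)
```

Splitting the URL list up front makes a stray trailing comma harmless, and rejecting an empty list early matches the table's note that URLs are required for URL input.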

Output

The node outputs JSON data representing the analysis results of the provided images or documents. The structure depends on the selected model and operation but generally includes parsed textual information derived from the images.

If binary data output is supported (e.g., synthesized speech), it will represent audio data corresponding to the selected voice profile.

Dependencies

  • Requires access to external AI models or intelligent agents capable of image/document analysis.
  • Needs network access to fetch images from URLs if inputType is URL.
  • For voice synthesis, requires integration with a text-to-speech service supporting multiple voice profiles.
  • Proper API authentication tokens or credentials must be configured in n8n to enable these services.

Troubleshooting

  • Invalid URL or inaccessible image: Ensure that the image URLs are correct, publicly accessible, and properly formatted.
  • Missing binary data field: When using binary-file input, verify that the binary property name matches the field that actually holds the incoming binary data.
  • Unsupported model or agent: If the specified model is invalid or unavailable, the node may fail; try leaving the model field empty or selecting a supported one.
  • Voice synthesis errors: If TTS fails, check that the voice selection is valid and that the TTS service credentials are correctly set up.
  • Simplify output issues: If simplified output does not contain expected details, try disabling simplification to get full raw responses.
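
The first two checks above can be run before the node executes. The sketch below is a hedged example with hypothetical helper names (`invalid_urls`, `has_binary_field`); it assumes the incoming item is a dict with an optional `binary` mapping keyed by field name. It checks only URL format, not whether the image is actually reachable.

```python
from urllib.parse import urlparse


def invalid_urls(image_urls: str) -> list:
    """Return entries of a comma-separated list that are not well-formed http(s) URLs."""
    bad = []
    for raw in image_urls.split(","):
        url = raw.strip()
        if not url:
            continue  # ignore empty entries such as a trailing comma
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(url)
    return bad


def has_binary_field(item: dict, field_name: str = "data") -> bool:
    """Check that the item carries binary data under the expected field name."""
    # Assumed item shape: {"json": {...}, "binary": {<field name>: {...}}}
    return field_name in (item.get("binary") or {})
```

For example, `invalid_urls("https://example.com/a.png, example.com/b.png")` flags the second entry because it lacks a scheme, and `has_binary_field(item, "data")` returns False when the upstream node stored the file under a different field name.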

Links and References

  • No direct external links are provided in the source code.
  • For voice options, the node supports a predefined list of Chinese and English voices, including official and cloned voices.
  • Users should refer to their AI service provider's documentation for detailed API usage and credential setup.
