Overview
This node integrates with the 秘塔AI platform to perform document and image parsing operations. It allows users to leverage AI models (referred to as "models" or "intelligent agents") for analyzing and extracting information from documents or images. A notable feature is the support for text-to-speech (TTS) voices, enabling audio output generation from parsed content.
Common scenarios where this node is beneficial include:
- Extracting structured data or insights from scanned documents or images.
- Automating content analysis workflows that require AI-powered understanding.
- Generating spoken versions of parsed text using selectable voice profiles.
For example, a user might upload an image containing a contract and use this node to extract key terms automatically, then convert the extracted text into speech using a preferred voice.
Properties
| Name | Meaning |
|---|---|
| 模型 (Model) | The AI model or intelligent agent used for parsing. Options vary by resource. For resource "mit": 全网-简洁 (web, concise), 全网-深入 (web, in-depth), 全网-研究 (web, research), 学术-简洁 (academic, concise), 学术-深入 (academic, in-depth), 学术-研究 (academic, research). |
| 暂不支持 (Not yet supported) | A notice shown when the operation is "file", indicating that certain features are currently unsupported. |
| 语音列表 (Voice list) | Selects the TTS voice for audio output. Available modes: 官方发音人 (official voices), a list of predefined voices such as 思远, 心悦, 子韬, and 灵儿; and 克隆发音人 (cloned voices), a custom cloned voice supplied as a string. |
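The model options listed for resource "mit" appear to follow a `scope-depth` naming pattern. The sketch below splits a label into those two parts. The helper name `describe_model` and the English glosses are illustrative assumptions, not identifiers taken from the node.

```python
from typing import List, Tuple

# English glosses are translations added for illustration (assumption).
SCOPES = {"全网": "web", "学术": "academic"}
DEPTHS = {"简洁": "concise", "深入": "in-depth", "研究": "research"}

# Rebuild the six option labels documented for resource "mit".
MODEL_OPTIONS: List[str] = [f"{s}-{d}" for s in SCOPES for d in DEPTHS]


def describe_model(label: str) -> Tuple[str, str]:
    """Split a label such as '全网-简洁' into (scope, depth) glosses."""
    scope, sep, depth = label.partition("-")
    if not sep or scope not in SCOPES or depth not in DEPTHS:
        raise ValueError(f"unrecognized model option: {label!r}")
    return SCOPES[scope], DEPTHS[depth]
```

A check like this can catch a mistyped option before a workflow runs, though the node itself may validate options differently.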
Output
The node outputs JSON data representing the results of the document or image parsing operation performed by the selected AI model. The exact structure depends on the API response from the 秘塔AI service but generally includes parsed text, metadata, and possibly structured data extracted from the input.
If text-to-speech is enabled, the node can also output audio data corresponding to the parsed content, using the selected voice profile. This binary data represents synthesized speech audio.
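As a rough illustration of consuming this output downstream, the sketch below separates the parsed JSON from the optional audio. It assumes an item shaped like `{"json": ..., "binary": ...}` with base64-encoded audio under a hypothetical `audio` key; the actual key names and response fields depend on the node and the 秘塔AI API and are not confirmed by this document.

```python
import base64
from typing import Optional, Tuple


def split_item(item: dict) -> Tuple[dict, Optional[bytes]]:
    """Return (parsed JSON, decoded audio bytes or None) for one output item."""
    parsed = item.get("json", {})
    # "audio" is an assumed property name for the synthesized speech.
    audio_entry = item.get("binary", {}).get("audio")
    if audio_entry is None:
        return parsed, None  # TTS was not enabled for this item
    return parsed, base64.b64decode(audio_entry["data"])


# Made-up example item; the field names are placeholders, not real API output.
example = {
    "json": {"text": "placeholder parsed text"},
    "binary": {
        "audio": {
            "data": base64.b64encode(b"RIFF").decode("ascii"),
            "mimeType": "audio/wav",
        }
    },
}
parsed, audio = split_item(example)
```

Treating the audio as optional keeps the same handler usable whether or not TTS is enabled.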
Dependencies
- Requires access to the 秘塔AI platform API.
- Needs configuration of appropriate API authentication credentials (e.g., an API key or token).
- For TTS voice selection, the node calls an internal voice list search method to populate the available official and cloned voices.
- No other external dependencies are indicated in the provided source code.
Troubleshooting
Common issues:
- Incorrect or missing API credentials will cause authentication failures.
- Selecting unsupported operations or resources may result in notices or errors.
- Using the "file" operation under resource "mit" currently shows a notice that it is not supported.
- If TTS voices are not loading, ensure the voice list search method is functioning and network connectivity is stable.
Error messages:
- Authentication errors typically indicate invalid or expired API tokens; reconfigure credentials.
- Unsupported operation notices mean the requested feature is not yet implemented or disabled.
- Network or API call failures suggest checking internet connection or API endpoint availability.
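The guidance above can be summarized as a small lookup from failure type to next step. This is a sketch only: the mapping from HTTP status codes to categories is an assumption, since the 秘塔AI API's actual error codes are not described here, and `troubleshooting_hint` is a hypothetical helper.

```python
from typing import Optional


def troubleshooting_hint(status: Optional[int]) -> str:
    """Map an HTTP status (None = no response received) to a suggested next step."""
    if status is None:
        return "network: check connectivity and API endpoint availability"
    if status in (401, 403):
        return "auth: token may be invalid or expired; reconfigure credentials"
    if status in (404, 405, 501):
        # Assumed codes for a feature that is not implemented or is disabled.
        return "unsupported: the requested operation may not be available"
    if status >= 500:
        return "service: the API may be unavailable; retry later"
    return "other: inspect the response body for details"
```

For instance, `troubleshooting_hint(401)` returns the credentials hint, while `troubleshooting_hint(None)` covers the case where the request never reached the API.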
Links and References
- 秘塔AI Official Website (for API documentation and account setup)
- Documentation on AI model options and TTS voice capabilities (refer to 秘塔AI developer resources)