SiliconFlow Chat Model

For advanced usage with AI Agent chains - supports function calling

Overview

This node integrates with SiliconFlow's AI chat models to generate text completions and support advanced AI agent or chain workflows. It is designed for scenarios where users want to leverage powerful language models that support function calling and reasoning capabilities, such as generating conversational responses, performing complex reasoning tasks, or integrating AI tools within automated workflows.

Practical examples include:

  • Creating chatbots that can call external functions or APIs dynamically.
  • Automating content generation with control over randomness and token limits.
  • Implementing AI agents that perform multi-step reasoning or chain-of-thought processes.

Properties

Name Meaning
Notice Informational notice guiding the user to connect this node to an AI Agent or AI Chain for advanced usage: "Connect to AI Agent or AI Chain to use this node. SiliconFlow models support function calling and reasoning capabilities."
Model The specific SiliconFlow model used to generate completions. All models support tool calling. Options include:
- GLM-4-Plus (recommended)
- GLM-4-0520
- GLM-4-AirX
- GLM-4-Air
- GLM-4-Flash
- GLM-4-AllTools
- Qwen2.5-72B-Instruct
- Qwen2.5-32B-Instruct
- Qwen2.5-14B-Instruct
- Qwen2.5-7B-Instruct
- DeepSeek-V2.5
- QwQ-32B (inference model)
- DeepSeek-R1 (inference model)
Options Additional options to customize the completion request:
- Maximum Number of Tokens: Maximum number of tokens to generate in the completion (1 to 16384). Default: 1024.
- Sampling Temperature: Controls the randomness of the output (0 to 2); lower values make the output more deterministic. Default: 0.7.
- Top P: Nucleus sampling parameter controlling diversity (0 to 1). Default: 0.7.
- Top K: Limits the number of tokens considered at each sampling step (1 to 100). Default: 50.
- Frequency Penalty: Penalizes new tokens based on how frequently they have already appeared (-2 to 2). Default: 0.
- Timeout: Maximum time allowed for the API request, in milliseconds. Default: 60000 (60 seconds).
- Max Retries: Maximum number of retry attempts on failure. Default: 2.
- Enable Thinking (inference models): Enables chain-of-thought reasoning for supported inference models. Default: false.
- Thinking Budget: Maximum number of tokens allocated to the reasoning process (128 to 32768). Only shown when "Enable Thinking" is true. Default: 4096.
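As a rough illustration of how these options could combine into a request body, the sketch below merges user-supplied options with the node's documented defaults. The field names (`max_tokens`, `enable_thinking`, `thinking_budget`, etc.) are assumptions based on SiliconFlow's OpenAI-compatible API, not a confirmed description of this node's internals.

```python
# Hypothetical sketch: merging the node's option defaults with
# user-supplied values into a chat-completions request payload.
# Field names are assumed, not taken from the node's source.

def build_payload(model, messages, options=None):
    """Combine the documented defaults with any user overrides."""
    opts = {
        "max_tokens": 1024,
        "temperature": 0.7,
        "top_p": 0.7,
        "top_k": 50,
        "frequency_penalty": 0,
    }
    opts.update(options or {})
    payload = {"model": model, "messages": messages, **opts}
    # Thinking Budget is only meaningful when Enable Thinking is on,
    # so drop it otherwise.
    if not payload.get("enable_thinking"):
        payload.pop("thinking_budget", None)
    return payload

payload = build_payload(
    "QwQ-32B",
    [{"role": "user", "content": "Hello"}],
    {"enable_thinking": True, "thinking_budget": 4096},
)
```

Note how the conditional mirrors the UI behaviour described above: the budget is only honoured when thinking is enabled.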

Output

The node outputs a JSON object containing the generated completion and additional metadata:

  • content: The main text response generated by the model.
  • additional_kwargs: An object that may include:
    • tool_calls: Details about any function/tool calls made by the model during generation.
    • reasoning: Reasoning content if chain-of-thought reasoning was enabled and used.
  • response_metadata: Metadata including the model name and usage statistics (e.g., token counts).

If the model performs tool calling or reasoning, these are included in the output under additional_kwargs.

The node does not output binary data.
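A downstream step that consumes this output typically needs to unpack those fields. The sketch below shows one way to do that; the exact shapes of `tool_calls` and `reasoning` are assumptions based on the description above.

```python
# Sketch: extracting the useful fields from the node's output JSON.
# The nested shapes shown in `example` are illustrative assumptions.

def summarize_output(result):
    """Pull out the text, any tool calls, and any reasoning trace."""
    extras = result.get("additional_kwargs", {})
    return {
        "text": result.get("content", ""),
        "tool_calls": extras.get("tool_calls", []),
        "reasoning": extras.get("reasoning"),
        "model": result.get("response_metadata", {}).get("model_name"),
    }

example = {
    "content": "It is sunny in Paris.",
    "additional_kwargs": {
        "tool_calls": [{"name": "get_weather", "args": {"city": "Paris"}}],
    },
    "response_metadata": {"model_name": "GLM-4-Plus",
                          "usage": {"total_tokens": 42}},
}
summary = summarize_output(example)
```

Using `.get()` with defaults keeps the consumer robust when tool calling or reasoning was not triggered and those keys are absent.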

Dependencies

  • Requires an API key credential for SiliconFlow's API service.
  • Needs network access to the configured SiliconFlow base URL.
  • Uses the Axios HTTP client internally to communicate with the SiliconFlow API.
  • Node configuration must include valid credentials with apiKey and baseUrl.
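To show how the `apiKey` and `baseUrl` credentials come together, here is a Python equivalent of the request the node is assumed to issue (the node itself uses Axios). It only builds the request without sending it; the `/chat/completions` path is an assumption based on SiliconFlow's OpenAI-compatible API.

```python
# Sketch: assembling an authenticated request from the node's credentials.
# Built but never sent, so it runs without network access or a real key.
import json
import urllib.request

def prepare_request(credentials, payload):
    """Build a POST request against the configured SiliconFlow base URL."""
    base_url = credentials["baseUrl"].rstrip("/")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {credentials['apiKey']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = prepare_request(
    {"apiKey": "YOUR_API_KEY", "baseUrl": "https://api.siliconflow.cn/v1/"},
    {"model": "GLM-4-Plus", "messages": [{"role": "user", "content": "Hi"}]},
)
```

Stripping the trailing slash from `baseUrl` before appending the path avoids the doubled-slash URLs that malformed credential values can otherwise produce.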

Troubleshooting

  • Invalid input format error: The node expects input as a string, an array of messages, or an object with a messages or content property. Providing unsupported formats will cause errors.
  • No response received from SiliconFlow: Indicates the API returned no usable data. Check API key validity, model selection, and network connectivity.
  • SiliconFlow API error: Generic error wrapper for issues communicating with the API. Could be due to invalid credentials, exceeding rate limits, or server errors. Verify credentials and retry settings.
  • Timeouts: If requests take longer than the configured timeout, increase the timeout value or check network conditions.
  • Retries exhausted: If max retries are reached without success, inspect logs for underlying causes like intermittent network failures or API throttling.
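The retry behaviour described above can be sketched as a simple loop: attempt the request, and on failure retry up to Max Retries additional times before surfacing the error. The backoff schedule below is an assumption; the node's actual retry logic may differ.

```python
# Sketch of retry-until-exhausted behaviour with exponential backoff.
# The flaky() helper simulates intermittent network failures.
import time

def call_with_retries(request_fn, max_retries=2, base_delay=0.0):
    """Run request_fn, retrying on failure up to max_retries extra times."""
    attempts = 0
    while True:
        try:
            return request_fn()
        except Exception:
            attempts += 1
            if attempts > max_retries:
                raise  # retries exhausted: surface the underlying error
            time.sleep(base_delay * 2 ** (attempts - 1))  # backoff, assumed

# Simulated flaky call: fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("intermittent network failure")
    return "ok"

result = call_with_retries(flaky, max_retries=2)
```

With the default Max Retries of 2, a request is attempted at most three times in total, which is why transient throttling often succeeds on a later attempt while persistent failures still surface promptly.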
