LLM文档转换 (LLM Document Conversion)

An LLM document-processing node that converts documents into an LLM-friendly format.

Overview

The "LLM文档转换" (LLM Document Conversion) node converts document files of various types into Markdown using LLM-based processing. Its "自动识别" (Auto Recognition) operation detects the type of the input document automatically and converts it to Markdown text.

This node is useful in scenarios where users have documents in formats like PDF, Word, Excel, PowerPoint, CSV, HTML, XML, or JSON and want to quickly transform them into clean, readable Markdown for documentation, note-taking, publishing, or further processing.

Practical examples:

  • Automatically converting meeting notes saved as DOCX files into Markdown for easy editing.
  • Transforming PDF reports into Markdown to integrate with static site generators.
  • Converting CSV data exports into Markdown tables for inclusion in documentation.

Properties

Name | Meaning
文件字段名 (File Field Name) | The name of the input file field containing the document to convert. Supported formats include pdf, doc, docx, ppt, pptx, xlsx, html, csv, etc.
返回Markdown文本 (Return Markdown Text) | Whether to return the converted Markdown text. Options: true (return the Markdown text), false (return only the URL of the converted document).

Output

The node outputs JSON data containing the result of the conversion. When "返回Markdown文本" is enabled, the output includes the Markdown text representation of the input document. If disabled, the output contains a URL pointing to the converted document instead.
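As a rough illustration of the two output modes, the sketch below shows how a downstream step might read one output item. The key names `markdown` and `url` are assumptions made for this example; the source does not name the actual output fields, so check a real execution result before relying on them.

```python
from typing import Any, Dict, Tuple


def read_conversion_result(item: Dict[str, Any]) -> Tuple[str, str]:
    """Return ("markdown", text) or ("url", link) from one output item.

    The keys "markdown" and "url" are hypothetical placeholders,
    not names confirmed by the node's documentation.
    """
    markdown = item.get("markdown")
    if isinstance(markdown, str) and markdown.strip():
        # Markdown text was returned (the Markdown option was enabled).
        return ("markdown", markdown)

    url = item.get("url")
    if isinstance(url, str) and url:
        # Only a link to the converted document was returned.
        return ("url", url)

    raise ValueError("output item contains neither Markdown text nor a URL")
```

Returning a tagged tuple lets the caller branch on the mode explicitly instead of guessing from the content of the string.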

No binary output is documented for this node. If it does emit binary data, that data would most likely be the converted document file.

Dependencies

  • Requires an API key credential for accessing the LLM document conversion service.
  • The node uses a base URL configured via credentials to send requests to the external LLM-based document processing API.
  • Supported document formats depend on the backend service capabilities (pdf, doc, docx, ppt, pptx, xlsx, html, csv, xml, json).
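Because unsupported formats are a likely cause of failed conversions, it can help to check the file extension before calling the node. The following is a minimal sketch that uses the format list given above; the helper name `is_supported` is hypothetical, and the backend service may accept more or fewer formats than this list.

```python
from pathlib import PurePath

# Extensions taken from the list above; the backend service is the
# final authority on what it accepts.
SUPPORTED_EXTENSIONS = frozenset(
    {"pdf", "doc", "docx", "ppt", "pptx", "xlsx", "html", "csv", "xml", "json"}
)


def is_supported(file_name: str) -> bool:
    """Return True if the file name ends in a listed extension."""
    suffix = PurePath(file_name).suffix  # e.g. ".PDF", or "" if none
    return suffix.lstrip(".").lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive and looks only at the final extension, so it says nothing about whether the file content is valid.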

Troubleshooting

  • Unsupported file formats or corrupted input files cause the conversion to fail.
  • Authentication errors indicate missing or invalid API credentials.
  • Network errors or timeouts occur when the external API service is unreachable.
  • If the output is empty or contains no Markdown content, verify that the "返回Markdown文本" (Return Markdown Text) property is enabled.
  • Make sure the value of "文件字段名" (File Field Name) matches the property that actually holds the file in the input data; a mismatch produces "file not found" errors.
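The field-name check in the last item can be sketched as follows. This assumes the common n8n item layout in which attached files sit under a `binary` key, indexed by property name; the helper `find_file_field` and the sample field name `data` are illustrative, not part of the node.

```python
from typing import Any, Dict


def find_file_field(item: Dict[str, Any], field_name: str) -> Dict[str, Any]:
    """Return the file entry stored under `field_name`, or raise KeyError.

    Assumes files are stored in item["binary"], keyed by property name.
    """
    binary = item.get("binary") or {}
    if field_name not in binary:
        # List the fields that do exist to make the mismatch easy to spot.
        available = ", ".join(sorted(binary)) or "none"
        raise KeyError(
            f"file field '{field_name}' not found; available fields: {available}"
        )
    return binary[field_name]
```

Listing the available field names in the error message usually reveals a simple naming mismatch between the upstream node and this one.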

Links and References

  • No direct links are provided in the source code. For more information, consult the documentation of the external LLM document conversion API used by this node.