
Convert any Document to Text

Convert any document to text.

Overview

This node converts any document accessible via a URL into plain text. It is useful when you need to extract textual content from various document formats (e.g., PDFs, Word documents) without manually opening or processing the files. Typical use cases include automating data extraction workflows, preparing documents for text analysis, or integrating document content into other systems.

For example, you can provide a URL pointing to a PDF file and the node will return the extracted text content, enabling further processing such as sentiment analysis or keyword extraction.

Properties

Name       Meaning
File URL   The publicly accessible URL of the document you want to convert to text.
File Name  The name of the file being converted, used to identify the document in the conversion.

Output

The node outputs JSON data containing the text extracted from the provided document. The source code does not specify the exact structure of this JSON; it is expected to include one or more fields holding the parsed text.

The source code gives no indication that the node outputs binary data.
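
Because the output structure is not documented, a downstream step may need to locate the text without relying on a specific field name. The following is a minimal sketch of one defensive approach, assuming only that the output is ordinary decoded JSON and that the extracted text is its longest string value. The helper names `iter_strings` and `longest_text` are hypothetical, and the sample field names in the usage note are invented for illustration.

```python
from typing import Any, Iterator


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def longest_text(output: Any) -> str:
    """Return the longest string in the output, or "" if it has none.

    ASSUMPTION: the extracted document text is the longest string value.
    This is a heuristic, not documented behavior of the node.
    """
    return max(iter_strings(output), key=len, default="")
```

For instance, `longest_text({"status": "ok", "result": {"text": "Hello from the PDF"}})` picks out the `"Hello from the PDF"` value. Once the real structure is known from a test run, reading the field directly is the better choice.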

Dependencies

  • Requires an API key credential and a user identifier credential to authenticate requests.
  • Connects to an external API endpoint at https://api.mona-ai.cloud/parsing/anyDocumentToText to perform the document-to-text conversion.
  • The node expects the input document to be accessible via a public URL.
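
To make the dependencies above concrete, here is a rough sketch of how a request to the endpoint might be assembled outside the node. Only the endpoint URL comes from this page. The HTTP method (POST), the body field names (`fileUrl`, `fileName`), and the header names (`x-api-key`, `x-user-id`) are assumptions chosen for illustration; the service's actual request format may differ.

```python
import json
import urllib.request

# The endpoint named in the Dependencies section.
ENDPOINT = "https://api.mona-ai.cloud/parsing/anyDocumentToText"


def build_request(file_url: str, file_name: str,
                  api_key: str, user_id: str) -> urllib.request.Request:
    """Build (but do not send) a request for the conversion endpoint."""
    # ASSUMPTION: the body field names below are hypothetical.
    body = json.dumps({"fileUrl": file_url, "fileName": file_name}).encode("utf-8")
    # ASSUMPTION: the credential header names below are hypothetical.
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        "x-user-id": user_id,
    }
    # ASSUMPTION: POST is a guess; this page does not state the HTTP method.
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")
```

The function only constructs the request, so it can be inspected without network access. Sending it would be a matter of passing the result to `urllib.request.urlopen`, which is only meaningful once the real field and header names are confirmed.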

Troubleshooting

  • Common issues:

    • Invalid or inaccessible file URL: Ensure the URL is correct and publicly reachable.
    • Missing or incorrect API credentials: Verify that the required API key and user ID are properly configured.
    • Unsupported file format or corrupted document: The external service may fail to parse certain document types or damaged files.
  • Error messages:

    • Authentication errors likely indicate invalid or missing credentials.
    • HTTP errors related to the URL usually mean the file cannot be accessed or found.
    • Parsing errors suggest the document format is unsupported or the file is corrupted.

To resolve these issues, check the credentials, verify that the file URL is accessible, and confirm that the document is valid and in a supported format.
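
The URL check can be done before running the node. The sketch below is one possible preflight, not part of the node itself: `check_url_syntax` validates the URL locally, and `check_url_reachable` sends a HEAD request from your own machine. Both function names are hypothetical. A successful local check does not guarantee that the external service can reach the file, and some servers reject HEAD requests even when the file is downloadable.

```python
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import urlparse


def check_url_syntax(file_url: str) -> Optional[str]:
    """Return a problem description, or None if the URL looks well formed."""
    parts = urlparse(file_url)
    if parts.scheme not in ("http", "https"):
        return "URL must start with http:// or https://"
    if not parts.netloc:
        return "URL has no host name"
    return None


def check_url_reachable(file_url: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request; return a problem description, or None on success.

    Call check_url_syntax first, since this function expects an http(s) URL.
    """
    request = urllib.request.Request(file_url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return None
    except urllib.error.HTTPError as err:
        # The server responded, but with an error status such as 403 or 404.
        return f"server answered HTTP {err.code}"
    except OSError as err:
        # Covers DNS failures, refused connections, and timeouts.
        return f"could not connect: {err}"
```

If either function returns a message, fix the URL before looking at credentials or file format.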
