DOC to Text

Convert DOC files to plain text using the Mammoth library.

Overview

This node converts DOC files to plain text using the Mammoth library. It is useful for extracting readable text content from DOC files in workflows, such as processing document uploads, automating text extraction for analysis, or integrating document content into other systems.

Use Case Examples

  1. Extract text from uploaded DOC files to store in a database.
  2. Convert DOC files to text for further natural language processing.
  3. Automate reading and processing of DOC documents in a workflow.

Properties

Name                 Meaning
Binary Field Name    Specifies the name of the binary field that contains the DOC file to be converted.
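
As a rough illustration of how a binary field name might be resolved, the sketch below looks up a named entry on an input item and decodes it into a Buffer. The `BinaryEntry` and `InputItem` shapes and the `readBinaryField` helper are hypothetical, and it assumes the file content is stored as a base64 string; the node's actual implementation may differ.

```typescript
// Hypothetical minimal shapes for illustration; the real item types carry more fields.
interface BinaryEntry {
  data: string; // file content, assumed here to be base64-encoded
  mimeType?: string;
  fileName?: string;
}

interface InputItem {
  json: Record<string, unknown>;
  binary?: Record<string, BinaryEntry>;
}

// Look up the binary field by name and decode it into a Buffer.
// The error wording mirrors the Troubleshooting entry; interpolating the
// field name into the message is an assumption.
function readBinaryField(item: InputItem, fieldName: string): Buffer {
  const entry = item.binary?.[fieldName];
  if (!entry) {
    throw new Error(`No binary data found for field "${fieldName}"`);
  }
  return Buffer.from(entry.data, "base64");
}
```

If the previous node stores the upload under a different field name, that name would be the value to enter in Binary Field Name.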

Output

JSON

  • text - The extracted plain text content from the DOC file.
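
The output shape can be sketched as follows. This is a minimal sketch, not the node's source: `docToText` and the `Extractor` type are hypothetical names, and the extractor is passed in as a parameter so the example runs without Mammoth installed. The `value` and `messages` fields follow the result object returned by Mammoth's `extractRawText`.

```typescript
// Result shape of Mammoth's extractRawText: text in `value`, diagnostics in `messages`.
interface ExtractResult {
  value: string;
  messages: { type: string; message: string }[];
}

// Any function that turns a file buffer into an ExtractResult.
type Extractor = (input: { buffer: Buffer }) => Promise<ExtractResult>;

// Convert a document buffer into an item whose JSON carries a single `text` field.
// Wrapping the result as { json: { text } } is an assumption about the item layout.
async function docToText(
  buffer: Buffer,
  extract: Extractor,
): Promise<{ json: { text: string } }> {
  const result = await extract({ buffer });
  return { json: { text: result.value } };
}

// With Mammoth installed, the call might look like:
//   import * as mammoth from "mammoth";
//   const item = await docToText(fileBuffer, (input) => mammoth.extractRawText(input));
```

Injecting the extractor keeps the conversion step easy to test with a stub in place of the library.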

Dependencies

  • Mammoth library for DOC to text conversion

Troubleshooting

  • The error 'No binary data found for field ""' indicates that the specified binary field does not contain a file. Ensure the input data includes a binary file under the given field name.

  • Warnings from Mammoth during text extraction are logged to the console; these usually do not stop processing but indicate potential issues with the DOC file format or content.
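
The warning behaviour described above might be handled along these lines. The `logWarnings` helper and its log prefix are hypothetical; the `type` and `message` fields match the message objects Mammoth returns alongside the extracted text.

```typescript
// Shape of a Mammoth diagnostic message.
interface MammothMessage {
  type: string; // e.g. "warning" or "error"
  message: string;
}

// Log each warning without interrupting processing; returns how many were logged.
// The logger is injectable so the behaviour can be checked without console output.
function logWarnings(
  messages: MammothMessage[],
  log: (line: string) => void = console.warn,
): number {
  const warnings = messages.filter((m) => m.type === "warning");
  for (const w of warnings) {
    log(`Mammoth warning: ${w.message}`);
  }
  return warnings.length;
}
```

Because warnings are only logged, a file that produces them still yields a `text` value; reviewing the console output is the way to spot documents that converted imperfectly.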
