mineru

Free and comprehensive document parsing capabilities

Package Information

Released: 9/23/2025
Downloads: 13 weekly / 96 monthly
Latest Version: 0.1.7
Author: opendatalab

Documentation

n8n-nodes-mineru

📖 Introduction

n8n-nodes-mineru is an n8n community node package that integrates the MinerU document parsing API, giving you free and comprehensive document parsing inside n8n workflows. It parses PDF, Word, PowerPoint, and image files, and automatically recognizes text, tables, formulas, and image content.

✨ Key Features

  • 🚀 Multi-format Support: Supports PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG and other formats
  • 🧠 Intelligent Recognition: Automatically recognizes text, tables, formulas, and images in documents
  • 🌐 Dual Service Modes: Supports both the online API service and a local self-deployed service
  • 📊 Multiple Output Formats: Supports Markdown, JSON, DOCX, HTML, LaTeX and other output formats
  • 🔧 Flexible Configuration: Offers a range of parameters to suit different scenarios
  • 🌍 Multi-language Support: Supports Chinese, English, and automatic language detection

📦 Included Nodes

1. MinerU Node

  • Function: Uses the MinerU online API service to parse documents
  • Features: Automatically creates a parsing task, waits for the result, and returns the parsed output as a ZIP file
  • Use Case: Users who want to use the official API service

2. MinerU Custom Service Node

  • Function: Connects to a self-deployed MinerU API server
  • Features: Supports local file upload and offers more custom configuration options
  • Use Case: Users who deploy MinerU themselves or need more control

🛠️ Installation

Method 1: Install via n8n Community Nodes

  1. Open the n8n interface
  2. Go to Settings > Community Nodes
  3. Click Install Community Node
  4. Enter package name: n8n-nodes-mineru
  5. Click Install

Method 2: Install via npm

# Run in the n8n root directory
npm install n8n-nodes-mineru

Method 3: Manual Installation

# Clone repository
git clone https://github.com/opendatalab/awsome-mineru.git
cd awsome-mineru/n8n-nodes-mineru

# Install dependencies
npm install

# Build project
npm run build

# Link to n8n (for development environment)
npm link

🔑 Credential Configuration

MinerU API Credentials

  1. Create a new credential in n8n
  2. Select the MinerU API credential type
  3. Enter your API Token
  4. Save the credential

Get API Token: obtain a token from the MinerU official website.

📋 Usage Guide

MinerU Node Usage

  1. Add Node: Add the "MinerU" node to your workflow
  2. Configure Credentials: Select the MinerU API credentials you created
  3. Set Parameters:
    • Document URL: Link to the document to be parsed (required)
    • Enable OCR: Whether to enable image text recognition
    • Enable Formula Recognition: Whether to recognize mathematical formulas
    • Enable Table Recognition: Whether to recognize table structures
    • Document Language: Select the main language of the document
    • Extra Export Format: Select additional output formats needed
    • Model Version: Select the MinerU model version to use
  4. Execute Node: The node automatically creates a parsing task and waits for it to complete
  5. Get Results: Once parsing completes, the node returns a ZIP file containing all results

MinerU Custom Service Node Usage

  1. Deploy Service: Deploy a MinerU API server first
  2. Add Node: Add the "MinerU Custom Service" node to your workflow
  3. Configure Parameters:
    • API Version: Select V1 or V2
    • File URL: Link to the document to be parsed
    • API Server Address: Your MinerU server address
    • Output Directory: Output directory for parsing results
    • Configure the remaining parameters according to the selected API version
  4. Execute Node: The node calls your server directly to parse the document

🔧 Parameter Description

Common Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| Document URL | String | - | URL address of the document to be parsed |
| Enable OCR | Boolean | false | Whether to enable optical character recognition |
| Enable Formula Recognition | Boolean | true | Whether to recognize mathematical formulas |
| Enable Table Recognition | Boolean | true | Whether to recognize table structures |
| Document Language | Option | Chinese | Main language of the document |

MinerU Node Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| Data ID | String | - | Optional data identifier |
| Page Range | String | - | Specify the page range to parse |
| Extra Export Format | Multi-select | [] | Additional output formats besides default |
| Polling Interval | Number | 5 | Interval time to check task status (seconds) |
| Maximum Wait Time | Number | 10 | Maximum time to wait for task completion (minutes) |
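
To make the interaction between Polling Interval and Maximum Wait Time concrete, here is a minimal sketch of a polling loop driven by those two settings. It is not the node's actual implementation: the task state names and the function names are hypothetical, and the loop runs against a pre-recorded list of states instead of calling the MinerU API.

```typescript
// Hypothetical task states; the real MinerU API may use different names.
type TaskState = "pending" | "running" | "done" | "failed";

interface PollOptions {
  intervalSeconds: number; // "Polling Interval", documented default 5
  maxWaitMinutes: number;  // "Maximum Wait Time", documented default 10
}

const DEFAULT_POLL: PollOptions = { intervalSeconds: 5, maxWaitMinutes: 10 };

interface PollOutcome {
  status: "done" | "failed" | "timeout";
  waitedSeconds: number;
}

// Number of status checks that fit into the wait budget.
function maxAttempts(opts: PollOptions): number {
  return Math.floor((opts.maxWaitMinutes * 60) / opts.intervalSeconds);
}

// Simulates polling: wait one interval, check the state, repeat until the
// task finishes or the wait budget is used up. `observed` stands in for the
// sequence of states a real status endpoint would return.
function simulatePolling(
  observed: TaskState[],
  opts: PollOptions = DEFAULT_POLL,
): PollOutcome {
  const limit = maxAttempts(opts);
  for (let attempt = 1; attempt <= limit; attempt++) {
    const waitedSeconds = attempt * opts.intervalSeconds;
    // Past the end of the recording, treat the task as still running.
    const state: TaskState = observed[attempt - 1] ?? "running";
    if (state === "done" || state === "failed") {
      return { status: state, waitedSeconds };
    }
  }
  return { status: "timeout", waitedSeconds: limit * opts.intervalSeconds };
}
```

With the documented defaults, the budget allows 120 status checks. If large documents regularly hit the timeout, raising Maximum Wait Time increases that budget, while lowering Polling Interval only makes the checks more frequent.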

Custom Service Node Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| API Server Address | String | http://localhost:8000 | MinerU server address |
| Output Directory | String | ./output | Output directory for parsing results |
| Backend Engine | Option | pipeline | Processing engine type |
| Return Markdown | Boolean | true | Whether to return Markdown format results |

🌟 Usage Examples

Example 1: Parse PDF Document and Extract Text

{
  "nodes": [
    {
      "name": "MinerU",
      "type": "n8n-nodes-mineru.mineru",
      "parameters": {
        "url": "https://example.com/document.pdf",
        "isOcr": true,
        "enableFormula": true,
        "enableTable": true,
        "language": "ch",
        "extraFormats": ["docx", "html"]
      }
    }
  ]
}
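
If you generate the parameters object above from code, a small helper can fill in the documented defaults. The sketch below uses the key names from Example 1. The mapping from the UI labels to those keys, the use of "ch" for the Chinese default, and the http(s) check on the URL are assumptions, and `buildParams` is a hypothetical helper, not part of the package.

```typescript
// Key names follow Example 1; their mapping to the UI labels is assumed.
interface MineruParams {
  url: string;
  isOcr: boolean;
  enableFormula: boolean;
  enableTable: boolean;
  language: string;
  extraFormats: string[];
}

// Defaults taken from the parameter tables ("ch" for Chinese is assumed
// from Example 1).
const DEFAULTS: Omit<MineruParams, "url"> = {
  isOcr: false,
  enableFormula: true,
  enableTable: true,
  language: "ch",
  extraFormats: [],
};

// Hypothetical helper: validates the required URL and fills in defaults
// for everything the caller did not set.
function buildParams(
  input: Partial<MineruParams> & { url: string },
): MineruParams {
  // Assumption: the Document URL must be an http(s) link.
  if (!/^https?:\/\/\S+$/i.test(input.url)) {
    throw new Error(`Document URL must be an http(s) link, got: ${input.url}`);
  }
  return {
    ...DEFAULTS,
    ...input,
    // Copy the array so callers never share the DEFAULTS instance.
    extraFormats: [...(input.extraFormats ?? [])],
  };
}
```

For example, `buildParams({ url: "https://example.com/document.pdf", isOcr: true })` keeps formula and table recognition enabled and leaves `extraFormats` empty.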

Example 2: Parse Multiple Format Documents Using Custom Service

{
  "nodes": [
    {
      "name": "MinerU Custom Service",
      "type": "n8n-nodes-mineru.mineruCustom",
      "parameters": {
        "apiVersion": "v2",
        "fileUrl": "https://example.com/presentation.pptx",
        "serverUrl": "http://your-mineru-server:8000",
        "langList": "auto",
        "formulaEnable": true,
        "tableEnable": true,
        "returnMd": true
      }
    }
  ]
}

🚀 Advanced Usage

Batch Document Processing

You can combine the MinerU node with other n8n nodes to process documents in batches:

  1. Use an HTTP Request node to get the document list
  2. Use a Split In Batches node to process the list in batches
  3. Use the MinerU node to parse each document
  4. Use a Merge node to combine the results
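
The batching step above can be pictured as plain list chunking. The following sketch is only an illustration of the idea, for example for use in custom code: `splitIntoBatches` is a hypothetical helper, not an n8n API, and it does not reproduce everything the Split In Batches node does.

```typescript
// Hypothetical helper: splits a list of document URLs (or any items)
// into consecutive batches of at most `batchSize` entries.
function splitIntoBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new Error(`batchSize must be a positive integer, got: ${batchSize}`);
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    // slice() clamps the end index, so the last batch may be shorter.
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}
```

Five URLs with a batch size of 2 yield three batches of sizes 2, 2, and 1. Smaller batches keep fewer parsing tasks waiting at the same time.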

Result Post-processing

After parsing completes, you can:

  1. Use a Move Binary Data node to process the returned files
  2. Use an HTTP Request node to upload the results to cloud storage
  3. Use an Email node to send the parsing results
  4. Use a Webhook node to trigger subsequent processes

🔍 Troubleshooting

Common Issues

Q: Node execution fails with "API Token verification failed"
A: Check that your API Token is correct and that you obtained a valid Token from the MinerU official website.

Q: Document parsing times out
A: Increase the "Maximum Wait Time" parameter, or check whether the document is too large.

Q: Custom service connection fails
A: Make sure your MinerU server is running and reachable over the network from n8n.

Q: Some document formats cannot be parsed
A: Confirm that the document format is in the supported list and that the file is not corrupted.
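
As a quick pre-check for the last question, the extension of a document URL can be compared against the formats named under Key Features. This is a rough sketch: the list holds only the formats this README names explicitly (it also mentions "other formats"), and `fileExtension` and `isSupportedFormat` are hypothetical helpers.

```typescript
// Formats named explicitly in this README; the service may accept more.
const SUPPORTED_EXTENSIONS = [
  "pdf", "doc", "docx", "ppt", "pptx", "png", "jpg", "jpeg",
];

// Returns the lower-cased extension of the last path segment, or "".
function fileExtension(documentUrl: string): string {
  // Drop the query string and fragment before looking at the path.
  const path = documentUrl.split(/[?#]/)[0] ?? "";
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  const dot = lastSegment.lastIndexOf(".");
  return dot === -1 ? "" : lastSegment.substring(dot + 1).toLowerCase();
}

function isSupportedFormat(documentUrl: string): boolean {
  return SUPPORTED_EXTENSIONS.indexOf(fileExtension(documentUrl)) !== -1;
}
```

A URL without a recognizable extension is reported as unsupported here, even though the server might still detect its type from the file content.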

Debugging Tips

  1. Enable Node Debug: Enable the "Continue On Fail" option in the node settings
  2. Check Error Logs: Check the n8n error logs for detailed information
  3. Test Connection: Test with a simple document first to confirm the connection works
  4. Check Parameters: Ensure all required parameters are set correctly

🤝 Contributing

We welcome community contributions! If you want to contribute to the project:

  1. Fork this repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE.md file for details.

If this project helps you, please give us a ⭐️!
