html-toon

n8n community node to convert HTML tables and lists to TOON format

Package Information

Downloads: 32 weekly / 49 monthly
Latest Version: 1.0.0
Author: Rexia Intel Automation

Documentation

n8n Community Node: HTML to TOON Converter

A powerful n8n community node that extracts structured data from HTML tables and lists, converting it to the TOON format - a compact, AI-friendly text representation perfect for data pipelines and LLM prompts.

n8n Community Node
License: ISC
TypeScript

✨ Features

  • HTML Table Extraction: Extract tabular data from HTML tables with nested table support
  • HTML List Extraction: Extract structured data from ordered/unordered lists
  • TOON Format Conversion: Convert extracted data to compact TOON format
  • Multiple Output Formats: Output as TOON string, JSON object, or both
  • Flexible Input Sources: HTML string, URL fetching, or previous node data
  • AI-Optimized: TOON format is perfect for LLM prompts and AI processing
  • Error Handling: Robust error handling with detailed error messages
  • Performance Optimized: Memory-efficient processing with performance tracking

📦 Installation

Option 1: Install in n8n (Recommended)

  1. In your n8n instance, go to Settings → Community Nodes
  2. Click Install community node
  3. Enter the package name: n8n-nodes-html-toon
  4. Click Install
  5. Restart n8n

Option 2: Manual Installation

# Clone the repository
git clone https://github.com/rexia-intel-automation/n8n-nodes-html-toon.git

# Install dependencies
cd n8n-nodes-html-toon
npm install

# Build the node
npm run build

# Copy the dist folder to your n8n custom nodes directory
cp -r dist /path/to/n8n/custom-nodes/

🚀 Quick Start

Basic Usage

  1. Add the "HTML to TOON" node to your workflow
  2. Configure the node:
    • Source Type: Choose where to get HTML from
    • HTML/URL: Provide HTML string or URL
    • Extraction Type: Tables, Lists, or Both
    • Output Format: TOON string, JSON, or Both
    • Output Name: Name for your data (e.g., "users", "products")

Example Workflows

Example 1: Extract Product Table from Website

[HTTP Request] → [HTML to TOON] → [AI Chat Model]

Configuration:

  • Source Type: URL
  • URL: https://example.com/products
  • Extraction Type: Tables
  • Output Format: TOON String
  • Output Name: products

Example 2: Process HTML from Previous Node

[Webhook] → [HTML to TOON] → [Database] → [Email]

Configuration:

  • Source Type: Previous Node Data
  • Extraction Type: Both (Tables and Lists)
  • Output Format: Both (TOON and JSON)
  • Output Name: extractedData

📖 TOON Format Explained

TOON (Table Object Object Notation) is a compact text format designed for AI consumption:

Basic Syntax

name[count]{fields}: value1, value2, ...; valueN

Examples

Simple Array:

users[2]{id, name, role}: 1, Alice, admin; 2, Bob, user

Nested Objects:

user{id, name, address{street, city}}: 1, Alice, Main St, NYC

Mixed Array:

products[3]{name, price, stock}: "Laptop", "$999", 10; "Mouse", "$49", 50; "Keyboard", "$79", 25

Key Features

  • Compact: Minimal syntax overhead
  • Readable: Human and AI readable
  • Structured: Preserves data relationships
  • Escaped: Handles special characters automatically
  • Consistent: Predictable output format
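
The syntax above can be illustrated with a small encoder. This is a minimal sketch, not the node's actual implementation: the names `toToon` and `encodeValue` are hypothetical, and the quoting rule (quote a string only when it is empty, padded with whitespace, or contains a delimiter) is an assumption, since the examples above quote some strings and leave others bare without stating the rule.

```typescript
type Scalar = string | number | boolean | null | undefined;
type Row = Record<string, Scalar>;

// Assumption: strings are quoted only when they would otherwise be ambiguous.
function encodeValue(value: Scalar, nullSymbol = "∅"): string {
  if (value === null || value === undefined) return nullSymbol;
  if (typeof value !== "string") return String(value);
  const needsQuotes =
    value === "" || value.trim() !== value || /[,;:{}\[\]"]/.test(value);
  return needsQuotes ? JSON.stringify(value) : value;
}

// Builds `name[count]{fields}: row1; row2; ...` from an array of flat objects.
function toToon(name: string, rows: Row[], nullSymbol = "∅"): string {
  // Field order follows first appearance across all rows.
  const fields: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!fields.includes(key)) fields.push(key);
    }
  }
  const body = rows
    .map((row) => fields.map((f) => encodeValue(row[f], nullSymbol)).join(", "))
    .join("; ");
  return `${name}[${rows.length}]{${fields.join(", ")}}: ${body}`;
}

const users: Row[] = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
];
console.log(toToon("users", users));
// users[2]{id, name, role}: 1, Alice, admin; 2, Bob, user
```

Missing fields fall back to the null symbol, so rows with different keys still produce the same number of values per row.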

🔧 Node Configuration

Input Parameters

Parameter          | Type       | Required    | Description
-------------------|------------|-------------|--------------------------------
Source Type        | Options    | Yes         | url, html, or input
URL                | String     | Conditional | Required if source type is url
HTML               | String     | Conditional | Required if source type is html
Extraction Type    | Options    | Yes         | tables, lists, or both
Output Format      | Options    | Yes         | toon, json, or both
Output Name        | String     | Yes         | Name for the output data
TOON Options       | Collection | No          | TOON formatting options
Extraction Options | Collection | No          | HTML extraction options
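
The conditional requirements in the parameter list can be expressed as a small validation routine. This is only a sketch: the `HtmlToonParams` interface and `validateParams` function are hypothetical, and the camelCase property names are assumed from the `sourceType` and `extractionType` keys that appear in the node's output metadata.

```typescript
// Hypothetical parameter shape; property names are assumptions.
interface HtmlToonParams {
  sourceType: "url" | "html" | "input";
  url?: string;
  html?: string;
  extractionType: "tables" | "lists" | "both";
  outputFormat: "toon" | "json" | "both";
  outputName: string;
}

// Returns a list of problems; an empty list means the parameters are usable.
function validateParams(p: HtmlToonParams): string[] {
  const errors: string[] = [];
  if (p.sourceType === "url" && !p.url?.trim()) {
    errors.push("url is required when sourceType is 'url'");
  }
  if (p.sourceType === "html" && !p.html?.trim()) {
    errors.push("html is required when sourceType is 'html'");
  }
  if (!p.outputName.trim()) {
    errors.push("outputName is required");
  }
  return errors;
}

const missingUrl: HtmlToonParams = {
  sourceType: "url",
  extractionType: "tables",
  outputFormat: "toon",
  outputName: "products",
};
console.log(validateParams(missingUrl));
```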

TOON Options

Option        | Type    | Default | Description
--------------|---------|---------|----------------------------------
Null Symbol   | String  | ∅       | Symbol for null/undefined values
Indent Size   | Number  | 2       | Indentation for nested structures
Include Count | Boolean | true    | Include [N] count in output

Extraction Options

Option             | Type    | Default | Description
-------------------|---------|---------|--------------------------------------
Include Attributes | Boolean | false   | Include HTML attributes in output
Remove Empty       | Boolean | true    | Remove empty rows/columns
Max Depth          | Number  | 3       | Maximum depth for nested tables/lists
Infer Headers      | Boolean | true    | Use first row as headers if no <th>
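
To make the Infer Headers and Remove Empty options concrete, here is a sketch of how already-parsed table rows might be turned into records. It is an illustration under stated assumptions, not the node's code: `Cell`, `ExtractOptions`, and `rowsToRecords` are hypothetical names, the fallback header names `column1`, `column2`, ... are invented for the example, and only empty rows (not empty columns) are removed.

```typescript
// A parsed table cell: its text and whether it came from a <th> element.
interface Cell {
  text: string;
  isHeader: boolean;
}

interface ExtractOptions {
  inferHeaders: boolean;
  removeEmpty: boolean;
}

function rowsToRecords(
  rows: Cell[][],
  opts: ExtractOptions = { inferHeaders: true, removeEmpty: true },
): Record<string, string>[] {
  // Remove Empty: drop rows whose cells are all blank.
  const kept = opts.removeEmpty
    ? rows.filter((r) => r.some((c) => c.text.trim() !== ""))
    : rows;
  const first = kept[0];
  if (!first) return [];

  // Use the first row as headers if it is made of <th> cells,
  // or if Infer Headers is enabled.
  const useFirstRow = first.every((c) => c.isHeader) || opts.inferHeaders;
  const headers = useFirstRow
    ? first.map((c) => c.text.trim())
    : first.map((_, i) => `column${i + 1}`); // assumed fallback naming
  const dataRows = useFirstRow ? kept.slice(1) : kept;

  return dataRows.map((r) => {
    const record: Record<string, string> = {};
    headers.forEach((h, i) => {
      record[h] = r[i]?.text.trim() ?? "";
    });
    return record;
  });
}

const td = (text: string): Cell => ({ text, isHeader: false });
const table: Cell[][] = [
  [td("name"), td("price")],
  [td("Laptop"), td("$999")],
  [td(""), td(" ")],
];
console.log(rowsToRecords(table));
```

With Infer Headers disabled and no `<th>` row, every row is treated as data and the generated column names are used instead.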

📊 Output Examples

TOON Output

{
  "toon": "products[2]{name, price}: \"Laptop\", \"$999\"; \"Mouse\", \"$49\"",
  "metadata": {
    "count": 2,
    "type": "array",
    "sourceType": "html",
    "extractionType": "tables",
    "performance": {
      "total": 45,
      "extraction": 15,
      "conversion": 5
    }
  }
}

JSON Output

{
  "json": [
    { "name": "Laptop", "price": "$999" },
    { "name": "Mouse", "price": "$49" }
  ],
  "metadata": { ... }
}

Both Outputs

{
  "toon": "users[1]{id, name}: 1, Alice",
  "json": { "id": 1, "name": "Alice" },
  "parsed": { "users": [{ "id": 1, "name": "Alice" }] },
  "metadata": { ... }
}

🔄 Advanced Usage

Processing Large HTML Files

For large HTML documents, use extraction options to optimize performance:

{
  "extractionOptions": {
    "removeEmpty": true,
    "maxDepth": 2,
    "includeAttributes": false
  }
}
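
Lowering `maxDepth` reduces work because nested structures below the limit are never visited. The sketch below shows the idea on a nested list. `ListNode` and `limitDepth` are hypothetical names, and counting top-level items as depth 1 is an assumption; the node may count depth differently.

```typescript
// A list item with its nested sub-items.
interface ListNode {
  text: string;
  children: ListNode[];
}

// Assumption: top-level items are depth 1; anything deeper than maxDepth is dropped.
function limitDepth(nodes: ListNode[], maxDepth: number, depth = 1): ListNode[] {
  if (depth > maxDepth) return [];
  return nodes.map((n) => ({
    text: n.text,
    children: limitDepth(n.children, maxDepth, depth + 1),
  }));
}

const nested: ListNode[] = [
  {
    text: "a",
    children: [{ text: "b", children: [{ text: "c", children: [] }] }],
  },
];
console.log(JSON.stringify(limitDepth(nested, 2)));
// [{"text":"a","children":[{"text":"b","children":[]}]}]
```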

AI Pipeline Integration

Use TOON format directly in AI prompts:

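# Example Python prompt for an LLM client such as OpenAI
# toon_string holds the "toon" field from the node output (sample value shown)
toon_string = 'products[3]{name, price, stock}: "Laptop", "$999", 10; "Mouse", "$49", 50; "Keyboard", "$79", 25'

prompt = f"""
Extract insights from this product data:
{toon_string}

What are the top 3 most expensive products?
"""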

Error Handling

The node includes comprehensive error handling:

{
  "error": true,
  "message": "Failed to fetch URL: Network error",
  "itemIndex": 0,
  "timestamp": "2024-01-01T12:00:00Z"
}
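
Downstream steps can separate failed items from successful ones by checking the `error` flag shown above. This is a sketch assuming error items have exactly the shape in the example; `ErrorItem`, `SuccessItem`, `isErrorItem`, and `partitionItems` are hypothetical names, not part of the node's API.

```typescript
// Shape of the error object shown in the example above.
interface ErrorItem {
  error: true;
  message: string;
  itemIndex: number;
  timestamp: string;
}

// Loose shape of a successful result; all fields depend on the Output Format.
interface SuccessItem {
  toon?: string;
  json?: unknown;
  metadata?: Record<string, unknown>;
}

// Type guard: an item is an error when its `error` field is strictly true.
function isErrorItem(item: unknown): item is ErrorItem {
  return (
    typeof item === "object" &&
    item !== null &&
    (item as { error?: unknown }).error === true
  );
}

// Splits node output into successful items and failures.
function partitionItems(items: unknown[]): { ok: SuccessItem[]; failed: ErrorItem[] } {
  const ok: SuccessItem[] = [];
  const failed: ErrorItem[] = [];
  for (const item of items) {
    if (isErrorItem(item)) failed.push(item);
    else ok.push(item as SuccessItem);
  }
  return { ok, failed };
}

const results = partitionItems([
  { toon: "users[1]{id, name}: 1, Alice" },
  {
    error: true,
    message: "Failed to fetch URL: Network error",
    itemIndex: 1,
    timestamp: "2024-01-01T12:00:00Z",
  },
]);
console.log(results.failed.map((e) => `item ${e.itemIndex}: ${e.message}`));
```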

🧪 Testing

Run the test suite:

# Install dependencies
npm install

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Run specific test
npm test -- test/converters.test.ts

Test Coverage

npm test -- --coverage

🏗️ Development

Project Structure

n8n-nodes-html-toon/
├── nodes/                    # n8n node implementation
│   └── HtmlToonNode/
│       ├── HtmlToonNode.ts   # Main node class
│       ├── HtmlToonNode.json # Node configuration
│       └── index.ts          # Node export
├── src/                      # Core library
│   ├── converters/           # TOON conversion logic
│   ├── extractors/           # HTML extraction logic
│   └── utils/                # Utilities
├── test/                     # Test files
├── examples/                 # Example workflows
└── dist/                     # Built files

Building the Node

# Install dependencies
npm install

# Build TypeScript
npm run build

# Development watch mode
npm run dev

Adding New Features

  1. Implement feature in src/
  2. Add tests in test/
  3. Update node in nodes/HtmlToonNode/
  4. Build and test
  5. Update documentation

🤝 Contributing

We welcome contributions! Here's how:

  1. Fork the repository
  2. Create a branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Run tests: npm test
  5. Commit: git commit -m 'Add amazing feature'
  6. Push: git push origin feature/amazing-feature
  7. Open a Pull Request

Contribution Guidelines

  • Follow TypeScript best practices
  • Add tests for new features
  • Update documentation
  • Use descriptive commit messages
  • Keep code consistent with existing style

📄 License

This project is licensed under the ISC License - see the LICENSE file for details.

🙏 Acknowledgments

  • n8n for the amazing automation platform
  • Cheerio for fast HTML parsing
  • Axios for reliable HTTP requests
  • All contributors and users of this node

📞 Support


Made with ❤️ by Rexia Intel Automation

If you find this node useful, please give it a star on GitHub! ⭐
