comfyui-video-to-text

n8n node to integrate with ComfyUI for video analysis and text extraction (descriptions, subtitles)

Package Information

Released: 9/1/2025
Downloads: 72 weekly / 72 monthly
Latest Version: 1.0.0
Author: daixudk2

Documentation

n8n-nodes-comfyui-video-to-text

This package provides an n8n node that runs ComfyUI workflows for video analysis and text extraction (descriptions, subtitles).

Features

  • Execute ComfyUI workflows directly from n8n for video analysis
  • Extract text descriptions from videos using AI models
  • Generate subtitles from video content
  • Support for workflow JSON import
  • Automatic text output retrieval from workflow results
  • Progress monitoring and error handling
  • Support for API key authentication
  • Configurable timeout settings
  • Multiple input methods (URL, Base64, Binary)

Prerequisites

  • n8n (version 1.0.0 or later)
  • ComfyUI instance running and accessible
  • Node.js 18 or newer
  • ComfyUI workflows configured for video analysis (e.g., with video captioning models, OCR nodes, or subtitle extraction models)

Installation

npm install n8n-nodes-comfyui-video-to-text

Node Types

ComfyUI Video to Text Node

This node allows you to analyze videos and extract text descriptions or subtitles using ComfyUI workflows.

Settings

  • API URL: The URL of your ComfyUI instance (default: http://127.0.0.1:8188)
  • API Key: Optional API key if authentication is enabled
  • Workflow JSON: The ComfyUI workflow in JSON format for video analysis
  • Input Type: Choose between URL, Base64, or Binary input methods
  • Input Video: URL or base64 string of the input video (when using URL or Base64 input type)
  • Binary Property: Name of the binary property containing the video (when using Binary input type)
  • Output Type: Choose between Description, Subtitles, or Both
  • Timeout: Maximum time in minutes to wait for video analysis
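
The settings above can be pictured as one typed object. The sketch below is illustrative only: the field names, the `checkSettings` helper, and the `timeoutMs` helper are hypothetical and may not match the node's real parameter names or validation rules.

```typescript
// Hypothetical shape for the settings listed above; the node's real
// parameter names may differ.
type InputType = "url" | "base64" | "binary";
type OutputType = "description" | "subtitles" | "both";

interface VideoToTextSettings {
  apiUrl: string;          // e.g. http://127.0.0.1:8188
  apiKey?: string;         // only when authentication is enabled
  workflowJson: string;    // ComfyUI workflow exported in API format
  inputType: InputType;
  inputVideo?: string;     // URL or base64 string (url/base64 input types)
  binaryProperty?: string; // binary property name (binary input type)
  outputType: OutputType;
  timeoutMinutes: number;
}

// Returns a list of problems; an empty list means the settings look usable.
function checkSettings(s: VideoToTextSettings): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//.test(s.apiUrl)) {
    problems.push("apiUrl must be an http(s) URL");
  }
  if (s.inputType === "binary") {
    if (!s.binaryProperty) {
      problems.push("binaryProperty is required for binary input");
    }
  } else if (!s.inputVideo) {
    problems.push("inputVideo is required for url and base64 input");
  }
  if (!(s.timeoutMinutes > 0)) {
    problems.push("timeoutMinutes must be positive");
  }
  return problems;
}

// The timeout is configured in minutes; timers usually want milliseconds.
function timeoutMs(s: VideoToTextSettings): number {
  return s.timeoutMinutes * 60 * 1000;
}
```

Checking the combination of Input Type, Input Video, and Binary Property up front gives a clearer message than a failure halfway through a ComfyUI run.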

Input

The node accepts a video input in three ways:

  1. URL: Provide a direct URL to a video file
  2. Base64: Provide a base64-encoded video string
  3. Binary: Use a video from a binary property in the workflow (e.g., from an HTTP Request node)
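
The three input methods map naturally onto a discriminated union. The following is a minimal sketch, assuming hypothetical names (`VideoInput`, `cleanBase64`, `describeInput`); it only validates the input and does not reflect how the node itself reads the video.

```typescript
// Hypothetical discriminated union for the three input methods.
type VideoInput =
  | { kind: "url"; url: string }
  | { kind: "base64"; data: string }
  | { kind: "binary"; property: string };

// Strips an optional data-URI prefix and whitespace, then checks that the
// remainder looks like standard base64. Throws on malformed input.
function cleanBase64(data: string): string {
  const body = data.replace(/^data:[^;,]+;base64,/, "").replace(/\s+/g, "");
  const looksValid =
    body.length > 0 &&
    body.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(body);
  if (!looksValid) {
    throw new Error("Input is not valid base64");
  }
  return body;
}

// Validates an input and returns a short human-readable summary of it.
function describeInput(input: VideoInput): string {
  switch (input.kind) {
    case "url":
      if (!/^https?:\/\/\S+$/.test(input.url)) {
        throw new Error("Video URL must be an http(s) URL");
      }
      return `download from ${input.url}`;
    case "base64":
      return `decode ${cleanBase64(input.data).length} base64 characters`;
    case "binary":
      return `read binary property "${input.property}"`;
  }
}
```

For example, `describeInput({ kind: "binary", property: "data" })` would describe reading the video from the `data` property, as produced by an HTTP Request node.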

Outputs

Each output item contains the extracted text and execution metadata:

  • text: The extracted text content (description, subtitles, or both)
  • outputType: The type of output requested
  • textCount: Number of text segments extracted
  • status: Execution status information
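
A downstream step (for example an n8n Code node) may want to check an item before using it. The sketch below assumes the field names from the list above; the field types are guesses, since the exact types of `text` and `status` are not documented here.

```typescript
// Assumed shape of one output item. Field names come from the list above;
// the types are guesses.
interface VideoToTextOutput {
  text: string;
  outputType: string;
  textCount: number;
  status: unknown;
}

// Type guard that a downstream step could use before trusting the item.
function isVideoToTextOutput(value: unknown): value is VideoToTextOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["text"] === "string" &&
    typeof v["outputType"] === "string" &&
    typeof v["textCount"] === "number" &&
    "status" in v
  );
}

// Returns the trimmed text, or null when the item is malformed or empty.
function extractedText(item: unknown): string | null {
  if (!isVideoToTextOutput(item) || item.textCount === 0) return null;
  return item.text.trim();
}
```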

Usage Examples

Using the ComfyUI Video to Text Node

  1. Create a workflow in ComfyUI for video analysis (e.g., using video captioning models, OCR nodes)
  2. Export the workflow as JSON (API format)
  3. Add the ComfyUI Video to Text node to your n8n workflow
  4. Paste your workflow JSON
  5. Select the appropriate Input Type:
    • For URL: Enter the video URL
    • For Base64: Provide a base64 string
    • For Binary: Specify the binary property containing the video (default: "data")
  6. Choose the Output Type (Description, Subtitles, or Both)
  7. Configure timeout as needed
  8. Execute the workflow to extract text from your input video
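
For orientation, the sketch below shows one way a client can queue an API-format workflow on ComfyUI and wait for its result, using ComfyUI's `POST /prompt` and `GET /history/{prompt_id}` endpoints. It is not the node's actual implementation: `HttpJson` and `runWorkflow` are hypothetical names, and the HTTP client is injected so the sketch stays independent of any particular library.

```typescript
// Minimal HTTP abstraction (hypothetical) so the sketch does not depend on
// a specific client. It should send/receive JSON and reject on HTTP errors.
type HttpJson = (
  method: "GET" | "POST",
  url: string,
  body?: unknown,
) => Promise<any>;

// Queues a workflow on ComfyUI and polls its history until outputs appear
// or the timeout elapses. Returns the raw outputs keyed by node id.
async function runWorkflow(
  http: HttpJson,
  apiUrl: string,
  workflow: Record<string, unknown>,
  timeoutMs: number,
  pollMs = 1000,
): Promise<Record<string, unknown>> {
  const base = apiUrl.replace(/\/+$/, "");
  const queued = await http("POST", `${base}/prompt`, { prompt: workflow });
  const promptId: string = queued.prompt_id;
  if (!promptId) throw new Error("ComfyUI did not return a prompt_id");

  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const history = await http("GET", `${base}/history/${promptId}`);
    const entry = history?.[promptId];
    if (entry?.outputs) return entry.outputs;
    // Not finished yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(`Timed out waiting for prompt ${promptId}`);
}
```

Polling keeps the sketch short; ComfyUI also exposes a WebSocket for progress events, which suits long videos better.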

Example ComfyUI Workflows

The node works with ComfyUI workflows that include:

  • Video Captioning Models: For generating descriptions
  • OCR Nodes: For extracting text from video frames
  • Subtitle Generation Models: For creating subtitles
  • Video Analysis Models: For content understanding

Make sure your ComfyUI workflow:

  • Has a LoadVideo or LoadImage node for video input
  • Produces text outputs through nodes that output strings, captions, descriptions, or subtitles
  • Is properly configured to handle video files
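
The first requirement can be checked before a workflow is submitted. In ComfyUI's API-format export, each key is a node id and each value carries a `class_type`. The helper below is a hypothetical sketch that looks for the two loader names mentioned above; custom video loader nodes with other class names would need to be added to the list.

```typescript
// A node in ComfyUI's API-format export: id -> { class_type, inputs }.
interface ApiNode {
  class_type: string;
  inputs: Record<string, unknown>;
}

// Loader class names taken from the requirement above.
const LOADER_TYPES = ["LoadVideo", "LoadImage"];

// Returns the ids of nodes that can act as the video input.
function findInputNodes(workflow: Record<string, ApiNode>): string[] {
  return Object.keys(workflow).filter(
    (id) => LOADER_TYPES.indexOf(workflow[id]?.class_type ?? "") !== -1,
  );
}
```

An empty result means the workflow has no recognizable input node and is likely to fail or ignore the supplied video.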

Error Handling

The node detects and reports the following error conditions:

  • API connection issues
  • Invalid workflow JSON
  • Execution failures
  • Timeout conditions
  • Input video validation
  • Missing text outputs
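
One way to make these conditions distinguishable to callers is to tag each error with a category. The sketch below is an assumption about how that could look, shown for the invalid-workflow case; `FailureKind`, `VideoToTextError`, and `parseWorkflow` are hypothetical names, not the node's API.

```typescript
// Hypothetical error categories mirroring the list above.
type FailureKind =
  | "connection"
  | "invalid_workflow"
  | "execution"
  | "timeout"
  | "invalid_input"
  | "no_text";

class VideoToTextError extends Error {
  readonly kind: FailureKind;
  constructor(kind: FailureKind, message: string) {
    super(message);
    this.name = "VideoToTextError";
    this.kind = kind;
  }
}

// Parses the Workflow JSON setting, turning a syntax error or a non-object
// value into a tagged error instead of a bare SyntaxError.
function parseWorkflow(json: string): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    throw new VideoToTextError("invalid_workflow", "Workflow JSON is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new VideoToTextError("invalid_workflow", "Workflow JSON must be an object of nodes");
  }
  return parsed as Record<string, unknown>;
}
```

A caller can then branch on `error.kind`, for example retrying on `connection` but failing fast on `invalid_workflow`.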

Development

# Install dependencies
npm install

# Build
npm run build

# Test
npm run test

# Lint
npm run lint

Supported Video Formats

The node supports common video formats that ComfyUI can process:

  • MP4
  • AVI
  • MOV
  • WebM
  • And other formats supported by ComfyUI

Requirements for ComfyUI Workflows

Your ComfyUI workflow should:

  1. Include a video input node (LoadVideo or LoadImage)
  2. Process the video through analysis models
  3. Output text results through nodes that produce:
    • text outputs
    • string outputs
    • caption outputs
    • description outputs
    • subtitles outputs
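
To illustrate the third requirement, the sketch below collects strings from a ComfyUI-style outputs object (node id mapped to that node's output fields) under the output names listed above. It is an assumption about how such retrieval could work, not the node's actual logic; `TEXT_KEYS` and `collectText` are hypothetical names.

```typescript
// Output keys treated as text, taken from the list above.
const TEXT_KEYS = ["text", "string", "caption", "description", "subtitles"];

// Walks outputs of the form { nodeId: { key: value } } and collects every
// non-empty string found under a text key. A value may be a single string
// or a list of strings; anything else is ignored.
function collectText(outputs: Record<string, Record<string, unknown>>): string[] {
  const found: string[] = [];
  for (const nodeId of Object.keys(outputs)) {
    const nodeOutput = outputs[nodeId] ?? {};
    for (const key of TEXT_KEYS) {
      const value = nodeOutput[key];
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (typeof item === "string" && item.trim() !== "") {
          found.push(item.trim());
        }
      }
    }
  }
  return found;
}
```

If the result is empty, the workflow ran but produced nothing under a recognized key, which corresponds to the missing-text-outputs error condition.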

License

MIT
