comfyui-video-to-text

n8n node to integrate with ComfyUI for video analysis and text extraction (descriptions, subtitles)

Package Information

Downloads: 72 weekly / 72 monthly

Latest Version: 1.0.0

Author: david

Available Nodes

ComfyUI Video to Text

Extract text descriptions or subtitles from videos using a ComfyUI workflow

Documentation

n8n-nodes-comfyui-video-to-text

This package provides an n8n node that integrates with ComfyUI for video analysis and text extraction (descriptions, subtitles).

Features

Execute ComfyUI workflows directly from n8n for video analysis
Extract text descriptions from videos using AI models
Generate subtitles from video content
Support for workflow JSON import
Automatic text output retrieval from workflow results
Progress monitoring and error handling
Support for API key authentication
Configurable timeout settings
Multiple input methods (URL, Base64, Binary)

Prerequisites

n8n (version 1.0.0 or later)
ComfyUI instance running and accessible
Node.js 18 or newer
ComfyUI workflows configured for video analysis (e.g., with video captioning models, OCR nodes, or subtitle extraction models)

Installation

npm install n8n-nodes-comfyui-video-to-text

Node Types

ComfyUI Video to Text Node

This node allows you to analyze videos and extract text descriptions or subtitles using ComfyUI workflows.

Settings

API URL: The URL of your ComfyUI instance (default: http://127.0.0.1:8188)
API Key: Optional API key if authentication is enabled
Workflow JSON: The ComfyUI workflow in JSON format for video analysis
Input Type: Choose between URL, Base64, or Binary input methods
Input Video: URL or base64 string of the input video (when using URL or Base64 input type)
Binary Property: Name of the binary property containing the video (when using Binary input type)
Output Type: Choose between Description, Subtitles, or Both
Timeout: Maximum time in minutes to wait for video analysis
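
As a rough illustration of how these settings relate to each other, the sketch below models them as a TypeScript type and checks a few obvious constraints. The interface name, the field names, and the validation rules are assumptions made for this example; they are not taken from the node's source.

```typescript
// Hypothetical shape for the settings listed above; all names are illustrative.
type InputType = "url" | "base64" | "binary";
type OutputType = "description" | "subtitles" | "both";

interface VideoToTextSettings {
  apiUrl: string;          // e.g. http://127.0.0.1:8188 (the documented default)
  apiKey?: string;         // only needed when authentication is enabled
  workflowJson: string;    // ComfyUI workflow JSON for video analysis
  inputType: InputType;
  inputVideo?: string;     // used with "url" or "base64"
  binaryProperty?: string; // used with "binary"
  outputType: OutputType;
  timeoutMinutes: number;
}

// Returns human-readable problems; an empty list means the settings look usable.
function checkSettings(s: VideoToTextSettings): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//.test(s.apiUrl)) {
    problems.push("API URL must start with http:// or https://");
  }
  if (s.workflowJson.trim() === "") {
    problems.push("Workflow JSON is empty");
  }
  if (s.inputType === "binary") {
    if (!s.binaryProperty) problems.push("Binary Property is required for Binary input");
  } else if (!s.inputVideo) {
    problems.push("Input Video is required for URL and Base64 input");
  }
  if (!(s.timeoutMinutes > 0)) {
    problems.push("Timeout must be a positive number of minutes");
  }
  return problems;
}

// The Timeout setting is in minutes; JavaScript timers use milliseconds.
function timeoutMs(s: VideoToTextSettings): number {
  return s.timeoutMinutes * 60 * 1000;
}
```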

Input

The node accepts a video input in three ways:

URL: Provide a direct URL to a video file
Base64: Provide a base64-encoded video string
Binary: Use a video from a binary property in the workflow (e.g., from an HTTP Request node)
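
The three input methods can be pictured as a tagged union. The sketch below is only an illustration: the type and function names are hypothetical, and tolerating a "data:...;base64," prefix on Base64 input is an assumption, not documented behavior of the node.

```typescript
// Hypothetical model of the three input methods described above.
type VideoInput =
  | { kind: "url"; url: string }
  | { kind: "base64"; data: string }
  | { kind: "binary"; property: string };

// Assumption: callers may paste a data URL; keep only the base64 payload.
function stripDataUrlPrefix(data: string): string {
  const marker = ";base64,";
  const idx = data.indexOf(marker);
  if (data.indexOf("data:") === 0 && idx !== -1) {
    return data.slice(idx + marker.length);
  }
  return data;
}

// Decoded size of a base64 string: 3 bytes per 4 characters, minus padding.
function base64ByteLength(data: string): number {
  const clean = stripDataUrlPrefix(data).replace(/\s+/g, "");
  if (clean.length === 0) return 0;
  let padding = 0;
  if (clean.charAt(clean.length - 1) === "=") padding = 1;
  if (padding === 1 && clean.charAt(clean.length - 2) === "=") padding = 2;
  return Math.floor((clean.length * 3) / 4) - padding;
}

// One-line summary of an input, e.g. for log messages.
function summarizeInput(input: VideoInput): string {
  switch (input.kind) {
    case "url":
      return `video at ${input.url}`;
    case "base64":
      return `inline video, about ${base64ByteLength(input.data)} bytes`;
    case "binary":
      return `binary property "${input.property}"`;
  }
}
```

A size estimate like this is one cheap way to reject an empty Base64 input before anything is sent to ComfyUI.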

Outputs

The node outputs extracted text:

text: The extracted text content (description, subtitles, or both)
outputType: The type of output requested
textCount: Number of text segments extracted
status: Execution status information
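
For downstream nodes it can help to think of the output as a typed record. The sketch below mirrors the four documented fields; joining segments with newlines, the type of `status`, and its two values are assumptions for illustration only.

```typescript
type OutputType = "description" | "subtitles" | "both";

// Mirrors the documented output fields. The exact form of `status` is not
// specified in this README, so a plain string is assumed here.
interface VideoToTextResult {
  text: string;
  outputType: OutputType;
  textCount: number;
  status: string;
}

// Assumption: segments are trimmed, empty ones dropped, and the rest joined by newlines.
function buildResult(segments: string[], outputType: OutputType): VideoToTextResult {
  const kept = segments.map((s) => s.trim()).filter((s) => s.length > 0);
  return {
    text: kept.join("\n"),
    outputType,
    textCount: kept.length,
    status: kept.length > 0 ? "success" : "no_text", // hypothetical status values
  };
}
```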

Usage Examples

Using the ComfyUI Video to Text Node

Create a workflow in ComfyUI for video analysis (e.g., using video captioning models, OCR nodes)
Export the workflow as JSON (API format)
Add the ComfyUI Video to Text node to your n8n workflow
Paste your workflow JSON
Select the appropriate Input Type:
- For URL: Enter the video URL
- For Base64: Provide a base64 string
- For Binary: Specify the binary property containing the video (default: "data")
Choose the Output Type (Description, Subtitles, or Both)
Configure timeout as needed
Execute the workflow to extract text from your input video
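
Behind these steps, a client typically queues the workflow on ComfyUI and polls for the result. The sketch below shows that pattern against ComfyUI's `/prompt` and `/history/{prompt_id}` endpoints. It is not the node's implementation: the function names, the Bearer-token header for the API key, and the polling interval are assumptions, and getting the video into the workflow is left out.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a specific HTTP client.
type HttpResponse = { ok: boolean; status: number; json(): Promise<any> };
type HttpInit = { method?: string; headers?: Record<string, string>; body?: string };
type Http = (url: string, init?: HttpInit) => Promise<HttpResponse>;

// Assumption: the API key travels as a Bearer token; a real deployment may differ.
function authHeaders(apiKey?: string): Record<string, string> {
  const headers: Record<string, string> = {};
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return headers;
}

// ComfyUI expects the API-format workflow under the "prompt" key.
function buildPromptRequest(workflowJson: string, apiKey?: string) {
  const headers = authHeaders(apiKey);
  headers["Content-Type"] = "application/json";
  return {
    method: "POST",
    headers,
    body: JSON.stringify({ prompt: JSON.parse(workflowJson) }),
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Queue the workflow, then poll the history endpoint until outputs appear
// or the timeout (in minutes, as in the node's Timeout setting) elapses.
async function runWorkflow(
  http: Http,
  apiUrl: string,
  workflowJson: string,
  timeoutMinutes: number,
  apiKey?: string,
  pollMs: number = 2000
): Promise<Record<string, unknown>> {
  const base = apiUrl.replace(/\/+$/, "");
  const queued = await http(`${base}/prompt`, buildPromptRequest(workflowJson, apiKey));
  if (!queued.ok) {
    throw new Error(`ComfyUI rejected the workflow (HTTP ${queued.status})`);
  }
  const promptId: string = (await queued.json()).prompt_id;

  const deadline = Date.now() + timeoutMinutes * 60 * 1000;
  while (Date.now() < deadline) {
    const res = await http(`${base}/history/${promptId}`, { headers: authHeaders(apiKey) });
    if (res.ok) {
      const history = await res.json();
      const entry = history[promptId];
      if (entry && entry.outputs) return entry.outputs;
    }
    await sleep(pollMs);
  }
  throw new Error(`Timed out after ${timeoutMinutes} minutes waiting for ComfyUI`);
}
```

On Node.js 18 or newer, the global `fetch` should be usable as the `http` argument, for example `runWorkflow(fetch, "http://127.0.0.1:8188", workflowJson, 30)`.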

Example ComfyUI Workflows

The node works with ComfyUI workflows that include:

Video Captioning Models: For generating descriptions
OCR Nodes: For extracting text from video frames
Subtitle Generation Models: For creating subtitles
Video Analysis Models: For content understanding

Make sure your ComfyUI workflow:

Has a LoadVideo or LoadImage node for video input
Produces text outputs through nodes that output strings, captions, descriptions, or subtitles
Is properly configured to handle video files
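
The first requirement can be checked mechanically before running anything. A workflow exported in API format is an object keyed by node id, where each node carries a `class_type`. The sketch below scans for the two loader types named above; the type and function names are hypothetical.

```typescript
// Shape of one node in a ComfyUI API-format workflow export.
interface ApiNode {
  class_type: string;
  inputs: Record<string, unknown>;
}
type ApiWorkflow = Record<string, ApiNode>;

// Returns the ids of nodes that could receive the input video.
function findVideoInputNodes(workflow: ApiWorkflow): string[] {
  const ids: string[] = [];
  for (const id of Object.keys(workflow)) {
    const type = workflow[id].class_type;
    if (type === "LoadVideo" || type === "LoadImage") ids.push(id);
  }
  return ids;
}
```

An empty result suggests the workflow has no place to receive the input video and is worth fixing in ComfyUI before using it from n8n.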

Error Handling

The node detects and reports the following error conditions:

API connection issues
Invalid workflow JSON
Execution failures
Timeout conditions
Input video validation
Missing text outputs
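
One way to keep these failure cases distinguishable in calling code is a tagged result type. The sketch below is an illustration only: the kind names, the `Result` type, and both helper functions are hypothetical and do not reflect how the node reports errors internally.

```typescript
// One tag per error condition listed above (names are illustrative).
type FailureKind =
  | "connection"
  | "invalid_workflow"
  | "execution"
  | "timeout"
  | "invalid_input"
  | "missing_text";

type Failure = { ok: false; kind: FailureKind; message: string };
type Result<T> = { ok: true; value: T } | Failure;

function fail(kind: FailureKind, message: string): Failure {
  return { ok: false, kind, message };
}

// Catches the "Invalid workflow JSON" case before anything is sent to ComfyUI.
function parseWorkflow(json: string): Result<Record<string, unknown>> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch (err) {
    return fail("invalid_workflow", `Workflow JSON does not parse: ${String(err)}`);
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return fail("invalid_workflow", "Workflow JSON must be an object keyed by node id");
  }
  const workflow = parsed as Record<string, unknown>;
  // A top-level "nodes" array is typical of the regular (UI) export, not the API format.
  if (Array.isArray(workflow["nodes"])) {
    return fail("invalid_workflow", "This looks like a UI export; export in API format instead");
  }
  return { ok: true, value: workflow };
}

// Catches the "Missing text outputs" case after a run has finished.
function requireText(segments: string[]): Result<string[]> {
  if (segments.length === 0) {
    return fail("missing_text", "The workflow finished but produced no text output");
  }
  return { ok: true, value: segments };
}
```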

Development

# Install dependencies
npm install

# Build
npm run build

# Test
npm run test

# Lint
npm run lint

Supported Video Formats

The node supports common video formats that ComfyUI can process:

MP4
AVI
MOV
WebM
And other formats supported by ComfyUI

Requirements for ComfyUI Workflows

Your ComfyUI workflow should:

Include a video input node (LoadVideo or LoadImage)
Process the video through analysis models
Output text results through nodes that produce:
- text outputs
- string outputs
- caption outputs
- description outputs
- subtitles outputs
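
To make the last requirement concrete, the sketch below walks a ComfyUI outputs object (node id mapped to that node's output fields) and collects strings found under the five key names listed above. The exact key spellings and the handling of single strings versus arrays are assumptions; the node's own retrieval logic may differ.

```typescript
// Key names assumed from the output kinds listed above.
const TEXT_KEYS = ["text", "string", "caption", "description", "subtitles"];

// Collects non-empty strings from every node's text-like output fields.
function collectText(outputs: Record<string, Record<string, unknown>>): string[] {
  const found: string[] = [];
  for (const nodeId of Object.keys(outputs)) {
    for (const key of TEXT_KEYS) {
      const value = outputs[nodeId][key];
      // Outputs are often arrays, but a bare string is accepted as well.
      const items: unknown[] = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (typeof item === "string" && item.trim() !== "") found.push(item);
      }
    }
  }
  return found;
}
```

Non-text fields such as image lists are ignored, so a workflow that also saves preview frames would still yield only its text segments.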

License

MIT
