YouTube Transcript With Language V2

Fetch YouTube subtitles with language and cookie support using yt-dlp

Overview

This node fetches YouTube video subtitles (transcripts) with support for specifying the language and handling restricted videos via cookie-based authentication. It uses the external tool yt-dlp to download subtitles and optionally video metadata. Users can choose to prefer manual subtitles over auto-generated ones, select the output format of the transcript, and include video metadata in the output.
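As a rough illustration of what a subtitle-only yt-dlp call can look like, the sketch below assembles a command line from the node's main options. The function name `build_subtitle_command` and its parameters are hypothetical, the `vtt` subtitle format is an assumption, and the node's actual invocation may differ. Only documented yt-dlp flags are used.

```python
from typing import List, Optional


def build_subtitle_command(
    video_url: str,
    language: str = "en",
    prefer_manual: bool = True,
    cookie_file: Optional[str] = None,
    binary_path: str = "yt-dlp",
) -> List[str]:
    """Assemble a yt-dlp command that downloads subtitles without the video.

    Hypothetical helper: it mirrors the node's options, not its source code.
    """
    command = [
        binary_path,
        "--skip-download",        # do not download the video stream
        "--write-auto-subs",      # allow auto-generated captions
        "--sub-langs", language,  # restrict to the requested language code
        "--sub-format", "vtt",    # assumed format; the node's choice is unknown
    ]
    if prefer_manual:
        # Requesting both kinds lets manual subtitles be used when they
        # exist, with auto-generated captions as the fallback.
        command.append("--write-subs")
    if cookie_file:
        # Netscape-format cookie file for restricted videos.
        command.extend(["--cookies", cookie_file])
    command.append(video_url)
    return command
```

The resulting list can be passed to a process runner such as `subprocess.run`; building it as a list rather than a shell string avoids quoting problems with URLs and paths.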

Common scenarios where this node is beneficial:

  • Extracting subtitles from public or restricted YouTube videos for translation, analysis, or accessibility purposes.
  • Archiving transcripts along with video metadata for content management.
  • Processing subtitles in different languages by specifying language codes.
  • Handling videos that require authentication cookies to access subtitles.

Practical example:

  • A user wants the manually created English subtitles of a private YouTube video they can access. They supply their browser cookies for authentication and set Output Format to "Both" to receive structured subtitle data with timestamps as well as the concatenated plain text for further processing.

Properties

| Name | Meaning |
| --- | --- |
| Video ID/URL | The YouTube video ID or full URL to extract subtitles from. |
| Language | Language code for the transcript (e.g., en, vi, fr, es). |
| Prefer Manual Subtitles | Whether to prefer manually created subtitles over auto-generated ones. |
| Output Format | Format of the transcript output: "Structured" (array with timestamps), "Plain Text" (concatenated text), or "Both". |
| Include Metadata | Whether to include video metadata such as title, duration, uploader, upload date, view count, description, thumbnail, tags, and categories. |
| Binary Path | Path to the yt-dlp binary executable. Use "yt-dlp" if installed globally. |
| Authentication Method | Method to authenticate for restricted videos: None, Cookie String, or Cookie File. |
| Cookie String | Cookie string exported from a browser for authentication (shown only if the "Cookie String" authentication method is selected). |
| Cookie File Path | Absolute path to a cookie file for authentication (shown only if the "Cookie File" authentication method is selected). |
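Because the Video ID/URL property accepts either form, the input has to be reduced to a bare video ID at some point. The sketch below shows one way this could be done. The helper names are hypothetical, the 11-character ID pattern is an assumption about current YouTube IDs, and the URL shapes handled here are not necessarily the same set the node accepts.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from the URL-safe base64 alphabet.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> str:
    """Return the video ID from a bare ID or a common YouTube URL shape."""
    value = value.strip()
    if not value:
        raise ValueError("The video ID/URL parameter is empty.")
    if _VIDEO_ID.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]
    candidate = ""
    if host == "youtu.be" and parts:
        candidate = parts[0]                      # https://youtu.be/<id>
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
            candidate = parts[1]                  # /embed/<id>, /shorts/<id>

    if not _VIDEO_ID.fullmatch(candidate):
        raise ValueError(f"Could not extract a video ID from: {value!r}")
    return candidate


def normalize_video_url(video_id: str) -> str:
    """Build the canonical watch URL for a video ID."""
    return f"https://www.youtube.com/watch?v={video_id}"
```

For example, `extract_video_id("https://youtu.be/dQw4w9WgXcQ?t=10")` and `extract_video_id("dQw4w9WgXcQ")` resolve to the same ID, which `normalize_video_url` then turns into a single canonical URL.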

Output

The node outputs an array of items, each containing a JSON object with the following structure:

  • youtubeId: The extracted YouTube video ID.
  • videoUrl: Normalized full YouTube video URL.
  • language: Language code of the transcript.
  • subtitleType: Either "manual" or "auto-generated" depending on subtitle preference and availability.
  • transcriptLength: Number of transcript entries.
  • transcript: (optional) Array of transcript items when output format includes structured data. Each item has:
    • text: Subtitle text.
    • start: Start time in seconds (float).
    • duration: Duration in seconds (float).
  • transcriptText: (optional) Plain text concatenation of all subtitle texts when output format includes plain text.
  • metadata: (optional) Object containing video metadata fields like title, duration, uploader, uploadDate, viewCount, description, thumbnail, tags, and categories.

No binary data output is produced by this node.
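To show how the structured output might be consumed downstream, the sketch below renders each transcript entry with a start and end timestamp and rebuilds a plain-text version. The sample item is invented for illustration and follows the field names listed above. The helper names are hypothetical, and joining entries with a single space is an assumption about how `transcriptText` is produced.

```python
from typing import Any, Dict, List


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def render_transcript(item: Dict[str, Any]) -> List[str]:
    """Return one line per entry; end time is start + duration."""
    lines = []
    for entry in item.get("transcript", []):
        start = entry["start"]
        end = start + entry["duration"]
        lines.append(
            f"[{format_timestamp(start)} - {format_timestamp(end)}] {entry['text']}"
        )
    return lines


def join_transcript_text(item: Dict[str, Any]) -> str:
    """Concatenate subtitle texts (separator is an assumption)."""
    return " ".join(entry["text"] for entry in item.get("transcript", []))


# Invented sample item using the documented field names.
sample_item = {
    "youtubeId": "dQw4w9WgXcQ",
    "language": "en",
    "subtitleType": "manual",
    "transcriptLength": 2,
    "transcript": [
        {"text": "Hello and welcome.", "start": 0.0, "duration": 2.5},
        {"text": "Let's get started.", "start": 2.5, "duration": 3.0},
    ],
}

for line in render_transcript(sample_item):
    print(line)
```

Since `transcript` is optional, both helpers fall back to an empty list when the output format is "Plain Text" only.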

Dependencies

  • Requires the external command-line tool yt-dlp to be installed and accessible via the specified binary path.
  • For restricted videos, requires either a valid cookie string or cookie file exported from a browser session.
  • Node environment must allow execution of child processes and temporary file creation.
  • No n8n credentials are required. When authentication is needed, it relies solely on the cookie data the user provides.
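Before wiring the node into a workflow, it can help to confirm that the binary path resolves and that child processes are allowed. The sketch below is one way to check this from the same host. The function name `yt_dlp_version` is hypothetical; it relies only on the standard library and on the `--version` flag of yt-dlp.

```python
import shutil
import subprocess
from typing import Optional


def yt_dlp_version(binary_path: str = "yt-dlp") -> Optional[str]:
    """Return the yt-dlp version string, or None if it cannot be run."""
    # shutil.which handles both bare names (searched on PATH) and
    # explicit paths (checked for existence and executability).
    resolved = shutil.which(binary_path)
    if resolved is None:
        return None
    try:
        result = subprocess.run(
            [resolved, "--version"],
            capture_output=True,
            text=True,
            timeout=15,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a non-zero exit, a timeout, or a failure to spawn.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = yt_dlp_version()
    print(version if version else "yt-dlp binary not found or failed to run")
```

A `None` result corresponds to the situation behind the "yt-dlp binary not found or failed to run" error, so this check can narrow the cause to either the path or the execution environment.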

Troubleshooting

  • Error: "yt-dlp binary not found or failed to run"
    Ensure that yt-dlp is installed and the binary path parameter points correctly to the executable. If installed globally, simply use "yt-dlp".

  • Error: "The video ID/URL parameter is empty."
    Provide a valid YouTube video ID or URL in the input property.

  • Error: "No transcript found for this video with language ..."
    The requested language subtitles are not available. Check available languages or try a different language code.

  • Error: "Failed to fetch video metadata" or "Failed to download transcript"
    These errors can indicate network issues, invalid or expired cookies, or video restrictions. Verify the authentication method and confirm that the cookies are still valid.

  • Temporary cookie files created for cookie string authentication are cleaned up automatically; however, permission issues in the temp directory may cause failures.

  • If using cookie file authentication, ensure the path is absolute and the file is accessible.
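The cookie-string path described above can be pictured as: convert the browser cookie string to a Netscape-format file, hand that file to yt-dlp, then delete it. The sketch below illustrates that lifecycle under stated assumptions: the helper names are hypothetical, every cookie is assigned to `.youtube.com` with a one-day expiry, and the node's real conversion logic is not known from this page.

```python
import os
import tempfile
import time
from contextlib import contextmanager
from typing import Iterator


def cookie_string_to_netscape(
    cookie_string: str,
    domain: str = ".youtube.com",   # assumed domain for all cookies
    ttl_seconds: int = 86_400,      # assumed lifetime: one day
) -> str:
    """Convert 'a=1; b=2' into Netscape cookie-file text."""
    expiry = str(int(time.time()) + ttl_seconds)
    lines = ["# Netscape HTTP Cookie File"]
    for pair in cookie_string.split(";"):
        name, sep, value = pair.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed fragments
        # Fields: domain, include-subdomains, path, secure, expiry, name, value
        lines.append("\t".join([domain, "TRUE", "/", "TRUE", expiry, name, value]))
    return "\n".join(lines) + "\n"


@contextmanager
def temporary_cookie_file(cookie_string: str) -> Iterator[str]:
    """Yield the absolute path of a temp cookie file, removing it afterwards."""
    fd, path = tempfile.mkstemp(prefix="yt-cookies-", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(cookie_string_to_netscape(cookie_string))
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)
```

A caller would wrap the yt-dlp invocation in `with temporary_cookie_file(cookie_string) as path:` and pass `path` to `--cookies`. If the temp directory is not writable, `tempfile.mkstemp` raises an `OSError`, which matches the permission failure mentioned above.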
