youtube-vtt's Introduction

Please note that open-source maintenance is not my main focus at the moment.

I will not be investing significant effort in the very near future to review and address issues on this repository. However, I do want my software to be usable!

If you have an issue that must be resolved for your work, please open a pull request to fix it, and send me a direct email to make sure that I see it. I ignore most messages from GitHub these days.

I'm also happy to help out if you have a question about how to use the library.

My email can be found at the top of this commit.

Keep in mind that I have a full-time job and a personal life as well as other hobbies that have taken priority over open source, so I might not respond immediately. But don't hesitate to follow up after a few days if you think I've missed your email.

youtube-vtt

Extract and save WebVTT (.vtt) closed caption files from YouTube videos.

YouTube videos don't use a standard closed caption format, so this script parses YouTube's format and converts it to WebVTT. The exported caption files can be used to display native captions in any browser that supports the HTML video element.
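To illustrate the output side of that conversion, here is a minimal sketch of serializing caption cues as WebVTT text. The cue shape (`startMs`, `endMs`, `text`) and the function names `formatTimestamp` and `toWebVtt` are assumptions made for this example; they are not taken from save-vtt-files.js, and the parsing of YouTube's own format is not shown.

```javascript
// Hypothetical cue shape for illustration: { startMs, endMs, text }.
// This is not youtube-vtt's internal representation.

// Format a millisecond offset as a WebVTT timestamp (HH:MM:SS.mmm).
function formatTimestamp(ms) {
  const pad = (n, width) => String(n).padStart(width, '0');
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  const millis = ms % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
}

// Build the file body: a "WEBVTT" header, then one block per cue,
// with blocks separated by blank lines.
function toWebVtt(cues) {
  const blocks = cues.map(
    (cue) =>
      `${formatTimestamp(cue.startMs)} --> ${formatTimestamp(cue.endMs)}\n${cue.text}`
  );
  return ['WEBVTT', ...blocks].join('\n\n') + '\n';
}
```

A file produced this way can be referenced from a track element inside an HTML video element.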

How to use

Simple usage (visiting page in browser)

  1. Open a YouTube video with closed captions in a web browser.

  2. Open the JavaScript console (in Chrome this is Ctrl+Shift+J, or Cmd+Option+J on macOS).

  3. Paste the contents of save-vtt-files.js and hit Enter.

  4. Run the command to export and save a .vtt file for each caption track:

    a. To export with default settings, just run:

    saveVttFiles();

    b. By default, only one caption is displayed at a time. YouTube stores captions with overlapping (two-at-a-time) durations, which suits the way YouTube displays them. To preserve the overlapping durations, run this instead:

    AVOID_CONCURRENT_CAPTIONS = false;
    saveVttFiles();

    c. If you'd like your captions to be auto-translated into a different language by YouTube, you can specify the language code as an option:

    saveVttFiles({ translationLanguageCode: 'zh-Hans' });

  5. For each caption track, a file will be saved called [Video Title]-[Language Code].vtt.
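
One way to picture the default behavior described in option b is to trim each caption so it ends when the next one starts. The sketch below is an assumption about how such trimming could work, not the script's actual implementation; the cue shape (`startMs`, `endMs`, `text`) and the function name `avoidConcurrentCaptions` are hypothetical.

```javascript
// Hypothetical cue shape for illustration: { startMs, endMs, text }.

// Return a copy of the cues in which no cue is still on screen
// once the following cue has started.
function avoidConcurrentCaptions(cues) {
  // Sort a copy by start time so "next" always means the following caption.
  const sorted = [...cues].sort((a, b) => a.startMs - b.startMs);
  return sorted.map((cue, i) => {
    const next = sorted[i + 1];
    // Only shorten a cue when it overlaps the next one.
    const endMs = next && next.startMs < cue.endMs ? next.startMs : cue.endMs;
    return { ...cue, endMs };
  });
}
```

Leaving the cues untouched corresponds to the overlapping behavior that is preserved when AVOID_CONCURRENT_CAPTIONS is set to false.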

Command line usage

Alternatively, you can use a CLI, which lets you trigger downloads in a more automated fashion.

Installation

npm install -g youtube-vtt

You also must have the Google Chrome browser (not Chromium) installed on your system or the commands below will fail.

Examples

Download captions for a video (vtt files go into a downloads directory under the current working path):

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX

Translate downloaded captions to Simplified Chinese:

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX --translation zh-Hans

Allow concurrent timespans for captions (disabled by default):

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX --concurrent

Wait only one second for downloads to complete (default wait time is 5 seconds):

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX --wait 1000

Wait 15 seconds for downloads to complete (if you have a slow connection):

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX --wait 15000

Run in debug mode. The browser will open a window rather than running in the background, and it will wait for you to close it manually, allowing you to interact with the browser and inspect the page:

youtube-vtt https://www.youtube.com/watch?v=XXXXXXXXXXX --debug
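
If you want to run the CLI over several videos, a small Node.js wrapper is one option. The sketch below assumes youtube-vtt is installed globally and on your PATH, and it only uses the flags shown in the examples above. The helper names `buildArgs` and `downloadAll` and the shape of the options object are hypothetical. On Windows you may need to pass `shell: true` to `spawnSync`, since globally installed npm commands are usually .cmd shims.

```javascript
const { spawnSync } = require('child_process');

// Translate a plain options object into the CLI flags documented above.
// The options object shape is an assumption made for this example.
function buildArgs(url, options = {}) {
  const args = [url];
  if (options.translation) args.push('--translation', options.translation);
  if (options.concurrent) args.push('--concurrent');
  if (options.wait !== undefined) args.push('--wait', String(options.wait));
  return args;
}

// Run the CLI once per URL, one after another, reporting any failure.
function downloadAll(urls, options) {
  for (const url of urls) {
    const result = spawnSync('youtube-vtt', buildArgs(url, options), {
      stdio: 'inherit',
    });
    if (result.status !== 0) {
      console.error(`youtube-vtt failed for ${url}`);
    }
  }
}
```

For example, `downloadAll(urls, { translation: 'zh-Hans', wait: 15000 })` would run the translation example above with a 15 second wait for each URL in `urls`.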

FAQ

Can I use this to get captions from a YouTube live stream?

If the live stream will eventually complete (e.g. a live stream of an event that lasts for a few hours), and you're able to wait, then the commands above will work for that recording once it has completed.

However, if you need to get captions from a live stream in progress, you can't yet use youtube-vtt or saveVttFiles(). I do have experimental code that will work to get captions for a live stream, which can be found here. I'm still trying to figure out the best way to integrate this with the main code base, so let me know if you have any suggestions. I'm still wondering what the main use case for consuming captions from a live stream would be.

youtube-vtt's People

Contributors

benwiley4000, waxpancake

youtube-vtt's Issues

Unexpected JSON token error message

Hello, @benwiley4000,

Following the simple usage and command line usage instructions in your README, I am still unable to successfully download a closed caption file. In both the Chrome JavaScript console and my CLI, I am receiving the same error:

Unexpected token u in JSON at position 0

As an example, this is the YouTube video I was working with, although I got the same error with other videos: A disfigured Randy Orton sends fiery message to The Fiend: Raw, Jan. 18, 2021

I am using Windows Terminal and have installed Node.js v14.15.4. Could I be missing other necessary packages, or is there another issue?

Thank you
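
For context on the message itself: in the V8 versions of that time, "Unexpected token u in JSON at position 0" is what `JSON.parse` throws when it is handed `undefined`, because the value is coerced to the string "undefined". Whether that is what happens inside the script for this video is an assumption. The sketch below only shows the general pattern of guarding such a call; the helper name `parseJsonOrNull` is hypothetical and not part of youtube-vtt.

```javascript
// Parse JSON defensively: return null instead of throwing when the input
// is missing or malformed. Hypothetical helper, not part of youtube-vtt.
function parseJsonOrNull(raw) {
  // JSON.parse(undefined) would coerce the input to the string "undefined"
  // and throw a SyntaxError, so reject non-string input up front.
  if (typeof raw !== 'string') return null;
  try {
    return JSON.parse(raw);
  } catch (err) {
    return null;
  }
}
```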

Use case for live stream

If you are looking for a use case for live streaming, it is the ability to provide real-time analytics on what is being said. One of our projects provides analysis of what politicians are saying in real time. The project currently relies on saving the stream to AWS and using Transcribe to extract the content for analysis, which introduces a delay between what is said and the analysis. Your approach is much faster and provides real benefit.

ytplayer? (issues with live stream)

Hey there, thanks for releasing this!

I followed your instructions but got this error:

VM26770:4 Uncaught ReferenceError: ytplayer is not defined

Where does this get defined?
