TwitchVault

A simplified tool that automatically archives VODs, clips, and highlights, along with their associated chat logs, for specified Twitch channels.


Concept

My personal goal has been to find or develop a tool that not only automates archiving the latest VODs, clips, and highlights from selected Twitch channels, but also archives the chat logs where they are available for each medium.

goldbattle's Twitch VOD Creator had everything needed to accomplish this, with just a little unneeded cruft on top. This repo has been heavily modified from that project and slimmed down to the core functions: automatically downloading available VODs, highlights, uploads, and clips for selected channels. It adds the ability to download chat logs for each video, optionally rendering the chat to video through TwitchDownloaderCLI's render function. As a bonus, speech-to-text via the Vosk speech recognition API carries over from the original code and is maintained in this fork as an optional tool.


Install Guide

  1. Ensure Python 3.6 or higher is installed.
  2. Clone this repository:
    • git clone https://github.com/cr08/TwitchVault
  3. Install main python dependencies:
    • python3 -m pip install --user -r requirements.txt
  4. Download and place TwitchDownloaderCLI for your platform into /thirdparty
    • Latest release recommended, minimum 1.50.7 required. Code has been updated to use new mode syntax introduced in this version.
    • Ensure TwitchDownloaderCLI is set as executable. This may be necessary on *nix platforms:
      • chmod +x thirdparty/TwitchDownloaderCLI
      • On a stripped-down distro, such as an LXC container, install the distribution's libicuXX package, as TDCLI appears to require it.
  5. Copy and fill out all config/*.yaml.example files as necessary.
    • Register an application with Twitch in the Twitch Dev console, then enter its client ID and secret into config/config.yaml
  6. Run scripts as desired:
    • python3 videos.py
    • python3 clips.py
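
Before the first run, the prerequisites from the steps above can be checked in one go. The following is only a minimal sketch and is not part of the repo: the `preflight` helper is hypothetical, and it assumes the layout described in this guide (a `thirdparty/TwitchDownloaderCLI` binary and a filled-out `config/config.yaml`).

```python
import os
import sys
from pathlib import Path

MIN_PYTHON = (3, 6)  # minimum version stated in the install guide


def preflight(repo_root, cli_name="TwitchDownloaderCLI"):
    """Hypothetical helper: return a list of problems, empty if all looks fine."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python 3.6 or higher is required")

    # Step 4: the CLI must exist in thirdparty/ and be marked executable.
    cli = Path(repo_root) / "thirdparty" / cli_name
    if not cli.is_file():
        problems.append(f"{cli} not found")
    elif not os.access(cli, os.X_OK):
        problems.append(f"{cli} is not executable (try chmod +x)")

    # Step 5: the example config must have been copied and filled out.
    config = Path(repo_root) / "config" / "config.yaml"
    if not config.is_file():
        problems.append(f"{config} not found")
    return problems
```

Calling `preflight(".")` from the repository root returns an empty list when everything is in place. On Windows the binary name would differ, so pass `cli_name` accordingly.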

Optional tasks

  • Linux targets: Add scripts to crontab using docs/crontab_script_launcher.sh and mark the launcher script executable:
    • chmod +x ./docs/crontab_script_launcher.sh
    • sudo crontab -e
    • */25 * * * * /path/to/repo/docs/crontab_script_launcher.sh videos.py
    • * */12 * * * /path/to/repo/docs/crontab_script_launcher.sh clips.py
      
  • For SRT transcription: Download a Vosk speech recognition model and extract it to ./thirdparty/, then point the path_model variable in videos.py to this folder.
    • vosk-model-small-en-us-<ver> is highly recommended for English-language content and is referenced by default. It has low resource usage and decent accuracy. In testing, full-size models have had high resource requirements (6GB+ free RAM and a high-end CPU). These can be used at your own discretion, but support will be limited here.

Known Issues

  • None at this time...

Credit & Attribution

This repo has been heavily modified from goldbattle's Twitch VOD Creator. All credit and attribution, as well as a huge amount of thanks, go out to them for creating the core functionality of automatically retrieving the requisite content from Twitch.

twitchvault's Issues

Adding optional standalone Vosk transcription tool

Reverting an old deletion and bringing back the 0_main_vtt_generation.py file. The intent here is to keep an optional standalone tool to transcribe audio to SRT from arbitrary video files, not necessarily ones downloaded by the main tools here.

At the time of this writing, the old file has been added back in unchanged, so it is unlikely to work as-is. More work to come on that front...

Migrate from writing placeholder _chat.json.BAD files on failed chat downloads to log file to clean up download folders

As a workaround for the time being, if a chat log is not available (either because nothing was said in chat for that video or because the source VOD is old/unavailable), we write a placeholder _chat.json.BAD file to satisfy future file checks and avoid attempting the download again.

This clutters up the download folders with useless files. We do want to clean this up, and the intent is to write video IDs to a log file as they fail to download and check against this file on subsequent passes.

Current non-zero exit code checks for TDCLI are crude. Corner cases such as a Twitch service being unavailable or your own network/ISP issues could cause it to erroneously flag a chat log as failed. Will need examples of TDCLI erroring out in scenarios like these.
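
The log-file approach described above could look roughly like the following. This is only a sketch under assumptions: the helper names and the one-ID-per-line log format are hypothetical and nothing here exists in the repo yet.

```python
from pathlib import Path


def load_failed_ids(log_path):
    """Return the set of video IDs recorded as failed (empty if no log yet)."""
    path = Path(log_path)
    if not path.is_file():
        return set()
    with path.open(encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def record_failed_id(log_path, video_id):
    """Append video_id to the log unless it is already listed."""
    video_id = str(video_id)
    if video_id in load_failed_ids(log_path):
        return
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(video_id + "\n")


def should_download_chat(log_path, video_id):
    """True if no earlier pass recorded this video's chat as failed."""
    return str(video_id) not in load_failed_ids(log_path)
```

A download pass would call `should_download_chat(...)` before invoking TDCLI and `record_failed_id(...)` only when a failure is known to be permanent. Since the exit code check is still crude, transient network failures should ideally not be recorded at all.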

Discord webhook notifications

Subject TL;DR:

Want to eventually add Discord webhook functionality to send out status notifications as certain tasks complete, fail, etc.

Low priority feature enhancement
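
As a starting point, a notification can be posted to a Discord webhook with nothing but the standard library. This is a rough sketch, not existing code: the function names and the User-Agent string are made up for illustration, and the webhook URL would have to come from the user's config.

```python
import json
import urllib.request

DISCORD_CONTENT_LIMIT = 2000  # Discord rejects longer message content


def build_webhook_request(webhook_url, message):
    """Build (but do not send) a POST request carrying a plain-text message."""
    payload = json.dumps({"content": message[:DISCORD_CONTENT_LIMIT]})
    return urllib.request.Request(
        webhook_url,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical identifier; an explicit User-Agent avoids relying
            # on urllib's default one.
            "User-Agent": "TwitchVault-notifier",
        },
        method="POST",
    )


def send_notification(webhook_url, message, timeout=10):
    """Send the message and return the HTTP status code."""
    request = build_webhook_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A task could then call something like `send_notification(url, "videos.py finished")` when it completes, wrapped in a try/except so that a failed notification never aborts a download run.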

Build docker container

It would be nice to package this as a Docker container, even as a simple script-based system.

Ultimately would like to make it compatible with Unraid's Community Applications system.

Low priority feature request/enhancement

VTT generation produces files that show one word at a time (as viewed via VLC)

Initial testing of the VTT render code and playing back in VLC shows that it displays a single word at a time on screen. Accuracy at first blush seems good and timing is perfect otherwise.

More research needs to be done here. Plan is to either find out how to fix the VTT file generation or skip it and just have Vosk write out an SRT file instead.

This is low priority.
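
One possible direction for the SRT route is to group the per-word results into multi-word cues before writing them out. The sketch below assumes the word entries are dictionaries with `word`, `start`, and `end` keys, which is the shape Vosk returns when word timings are enabled. The function names and the fixed words-per-cue grouping are illustrative only, not what the current code does.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3600000)
    minutes, millis = divmod(millis, 60000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, words_per_cue=7):
    """Group word entries into cues of up to words_per_cue words each."""
    cues = []
    for index in range(0, len(words), words_per_cue):
        chunk = words[index:index + words_per_cue]
        start = srt_timestamp(chunk[0]["start"])  # first word opens the cue
        end = srt_timestamp(chunk[-1]["end"])     # last word closes it
        text = " ".join(entry["word"] for entry in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

A smarter version might also break cues on long pauses between words rather than on a fixed count, but a fixed count is enough to check whether the one-word-at-a-time display goes away.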

Automatic TDCLI version checker and downloader / VOSK downloader

An extra feature I am looking into is automating the download and upkeep of the TDCLI and Vosk prerequisites.

Ideas:

  • Include a TDCLI version number in this repo. They seem to make changes often enough that I need to account for them, so it may be worth just maintaining a target supported version. It can still be changed in code if someone wants to 'roll their own'.

  • Check for an existing TDCLI download and check its version. If it doesn't exist or the version is incorrect, we can grab the correct version.

  • GitHub has a JSON response listing available releases for a repo that we can use: https://api.github.com/repos/lay295/TwitchDownloader/releases/latest

  • Use some OS checking tools in Python to verify OS type, 32 vs 64 bit, etc. Use some fuzzy wildcard naming to pick the correct option and download link. Download and unzip in place.

  • Vosk download should be easy. Models don't appear to have changed in ages, so for all intents and purposes we can just use a static download URL. Either at startup or when the SRT task rolls around, check whether the SRT tooling is needed and, if so, do the same song and dance of checking the files in the repo (may get fancy and hash the individual files, verify they all exist, and re-download/unzip if not?), then download and unzip.
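
The release lookup and wildcard matching from the ideas above might be sketched like this. It relies on the `tag_name`, `assets`, `name`, and `browser_download_url` fields of GitHub's release JSON. The wildcard pattern built by `guess_asset_pattern` is purely an assumption about how the TDCLI zip files are named and would need checking against the real release assets. All function names are hypothetical.

```python
import fnmatch
import json
import platform
import urllib.request

RELEASES_URL = "https://api.github.com/repos/lay295/TwitchDownloader/releases/latest"


def fetch_latest_release(url=RELEASES_URL, timeout=15):
    """Download and parse the latest-release JSON from GitHub."""
    request = urllib.request.Request(url, headers={"User-Agent": "TwitchVault-updater"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def guess_asset_pattern(system=None, machine=None):
    """Build a lowercase wildcard for this platform (naming scheme is assumed)."""
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    os_names = {"Linux": "linux", "Windows": "windows", "Darwin": "macos"}
    if system not in os_names:
        raise ValueError(f"unsupported platform: {system}")
    arch = "arm" if machine.startswith(("arm", "aarch")) else "x64"
    return f"*cli*{os_names[system]}*{arch}*.zip"


def pick_asset(release, pattern):
    """Return (tag_name, download_url) for the first matching asset, else None."""
    for asset in release.get("assets", []):
        if fnmatch.fnmatchcase(asset["name"].lower(), pattern.lower()):
            return release["tag_name"], asset["browser_download_url"]
    return None
```

Typical use would be `pick_asset(fetch_latest_release(), guess_asset_pattern())`, followed by downloading and unzipping the result into thirdparty/. Comparing the returned tag against a pinned target version would cover the version check.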
