audiobook-chapter-splitter

A tool to split monolithic audiobook files (mp3 or any other format that FFmpeg and Whisper/Vosk can read) into one file per chapter. It transcribes the audio with a speech recognition tool (Whisper or Vosk) and then looks for a user-defined keyword that marks the chapter boundaries. Inspired by Dan Gravell's blog post "Splitting audiobooks into chapters with AI and crossed fingers".

The steps can be summarized as follows:

  1. Transcribe the audiobook file to an SRT file using Whisper or Vosk.
  2. Use grep to find the chapter keyword in the SRT file, along with the timestamps at which it occurs.
  3. Use ffmpeg to split the audiobook file into one file per chapter based on those timestamps.
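
As a rough illustration of step 2, the sketch below pulls the start timestamp of every SRT cue whose text contains the keyword. It is not taken from the project's script: the function name `find_chapter_starts` is hypothetical, and it uses awk rather than grep so that the match and its timestamp are handled in one pass. It also rewrites the SRT decimal comma as a dot, on the assumption that the timestamps will later be handed to ffmpeg.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of audiobook-chapter-splitter.sh):
# print the start timestamp of every SRT cue whose text contains the keyword.
# Usage: find_chapter_starts <srt file> <keyword>
find_chapter_starts() {
    awk -v kw="$2" '
        BEGIN { kw = tolower(kw) }
        # Timing line, e.g. "00:01:02,500 --> 00:01:05,000": remember the cue start,
        # with the decimal comma replaced by a dot.
        / --> / { start = $1; sub(/,/, ".", start); printed = 0; next }
        # Text line containing the keyword (case-insensitive): report the cue once.
        start != "" && !printed && index(tolower($0), kw) > 0 { print start; printed = 1 }
    ' "$1"
}
```

For example, `find_chapter_starts audiobook.srt kapitel` would print one timestamp per line, one for each cue in which "kapitel" is spoken. The match is a plain substring test, so a keyword that also appears inside ordinary words will produce false chapter breaks.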

My own use case is splitting audiobooks into chapters, but the tool works for any audio file that has a keyword separating its sections, in any language supported by the speech recognition tool.

Prerequisites

You need either Whisper or Vosk installed on your system. If both are installed, Whisper is used by default unless the command-line arguments specify otherwise.

pip3 install vosk
# or
pip3 install openai-whisper

You also need to have FFmpeg installed on your system.

# debian-based
sudo apt install ffmpeg
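
To confirm that the prerequisites are in place before a long transcription run, a check along these lines may help. This is a sketch only: `have_cmd` and `check_prerequisites` are hypothetical names, and the command names `whisper` and `vosk-transcriber` are assumptions about what the two pip packages put on your PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of audiobook-chapter-splitter.sh).

# Succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Return 0 if ffmpeg and at least one transcriber are available, 1 otherwise.
check_prerequisites() {
    local missing
    missing=0
    if ! have_cmd ffmpeg; then
        echo "ffmpeg not found" >&2
        missing=1
    fi
    # Assumed command names: `whisper` (openai-whisper), `vosk-transcriber` (vosk).
    if ! have_cmd whisper && ! have_cmd vosk-transcriber; then
        echo "neither whisper nor vosk-transcriber found" >&2
        missing=1
    fi
    return "$missing"
}
```

Running `check_prerequisites || exit 1` in a wrapper script would stop early with a message on stderr naming what is missing.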

Installation

# clone the repository
git clone https://github.com/dahlo/audiobook-chapter-splitter.git

Usage

  audiobook-chapter-splitter.sh
  -------------------------------------
  This script splits an audiobook file into chapters.

  Usage:
  audiobook-chapter-splitter.sh -i <input file> -o <output directory> -c <chapter keyword> [-w] [-v] [-a ARGS] [-h]
  ex.
  audiobook-chapter-splitter.sh -i audiobook.mp3 -o chapters -c kapitel -w -a "--model medium --language Swedish"

  Options:
    -i: Path to the input file
    -o: Path to the output directory
    -c: Keyword that is used to identify breakpoints between chapters (case-insensitive)
    -w: Use this flag to use OpenAI's Whisper for transcription
    -v: Use this flag to use Vosk for transcription
    -a: Pass any additional arguments to the transcriber using this flag
    -h: Print this help message
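
To give a feel for what the final splitting step amounts to, the sketch below turns a list of chapter start timestamps into one ffmpeg command per chapter and prints the commands instead of running them. It is an illustration under assumptions, not the script's actual logic: the function name `print_split_commands`, the `chapter_NNN` output naming, and the choice of `-c copy` are all hypothetical, and file names are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of audiobook-chapter-splitter.sh):
# read chapter start timestamps (one per line) from stdin and print
# one ffmpeg command per chapter for review.
# Usage: print_split_commands <input file> <output directory>
print_split_commands() {
    local input outdir ext prev start n
    input=$1
    outdir=$2
    ext=${input##*.}   # reuse the input file's extension for the chapters
    prev=""
    n=0
    while IFS= read -r start || [ -n "$start" ]; do
        [ -z "$start" ] && continue
        if [ -n "$prev" ]; then
            # A chapter spans from its own start to the start of the next one.
            n=$((n + 1))
            printf 'ffmpeg -i %s -ss %s -to %s -c copy %s/chapter_%03d.%s\n' \
                "$input" "$prev" "$start" "$outdir" "$n" "$ext"
        fi
        prev=$start
    done
    if [ -n "$prev" ]; then
        # The last chapter runs to the end of the file, so no -to is given.
        n=$((n + 1))
        printf 'ffmpeg -i %s -ss %s -c copy %s/chapter_%03d.%s\n' \
            "$input" "$prev" "$outdir" "$n" "$ext"
    fi
}
```

For instance, `printf '00:00:00.000\n00:10:30.250\n' | print_split_commands audiobook.mp3 chapters` prints two commands, the first covering 00:00:00.000 to 00:10:30.250 and the second covering the rest of the file. Stream copying avoids re-encoding, though cuts may then land on the nearest packet boundary rather than the exact timestamp.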
