audiobook-chapter-splitter

A tool that splits monolithic audiobook files (MP3 or any other format that FFmpeg and Whisper/Vosk can read) into one file per chapter. It runs speech recognition (Whisper or Vosk) on the audio and searches the transcript for a user-defined keyword that marks the start of each chapter. Inspired by Dan Gravell's blog post "Splitting audiobooks into chapters with AI and crossed fingers".

The steps can be summarized as follows:

1. Transcribe the audiobook file to an SRT file using Whisper or Vosk.
2. Use grep to find every occurrence of the chapter keyword in the SRT file, together with the timestamp at which it occurs.
3. Use FFmpeg to split the audiobook file into one file per chapter at those timestamps.
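
The second and third steps can be sketched roughly as follows. This is not the script's actual code: the helper names chapter_starts and print_split_commands are made up for illustration, awk stands in for grep so that the timestamp of each matching subtitle cue is picked up in one pass, and the ffmpeg commands are only printed, not run. The sketch assumes the SRT file already exists and that any audio before the first keyword can be dropped.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not taken from the script.

# Print the start time (HH:MM:SS.mmm) of every SRT cue whose text
# contains the keyword, case-insensitively.
chapter_starts() {
    local keyword="$1" srt_file="$2"
    awk -v kw="$keyword" '
        BEGIN            { kw = tolower(kw) }
        /^[[:space:]]*$/ { start = ""; next }                  # blank line ends a cue
        / --> /          { start = $1; sub(/,/, ".", start); next }  # SRT has a comma before the milliseconds
        start != "" && index(tolower($0), kw) { print start; start = "" }  # at most one hit per cue
    ' "$srt_file"
}

# Read start times on stdin and print one ffmpeg command per chapter.
# A chapter runs until the next start time; the last one runs to the end.
print_split_commands() {
    local input="$1" outdir="$2" ext="${1##*.}"
    local prev="" n=0 t
    while IFS= read -r t; do
        if [ -n "$prev" ]; then
            n=$((n + 1))
            printf 'ffmpeg -i "%s" -ss %s -to %s -c copy "%s/chapter_%02d.%s"\n' \
                "$input" "$prev" "$t" "$outdir" "$n" "$ext"
        fi
        prev="$t"
    done
    if [ -n "$prev" ]; then
        n=$((n + 1))
        printf 'ffmpeg -i "%s" -ss %s -c copy "%s/chapter_%02d.%s"\n' \
            "$input" "$prev" "$outdir" "$n" "$ext"
    fi
}
```

With a transcript in audiobook.srt, running `chapter_starts kapitel audiobook.srt | print_split_commands audiobook.mp3 chapters` prints the commands for review before anything is written. The `-c copy` option asks FFmpeg to copy the audio stream instead of re-encoding it.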

My own use case is splitting audiobooks into chapters, but the tool works for any audio file in which a spoken keyword separates the sections, in any language supported by the speech recognition tool.

Prerequisites

You need either Whisper or Vosk installed on your system. If both are installed, Whisper is used by default unless you choose otherwise with the command-line arguments.

pip3 install vosk
# or
pip3 install openai-whisper

You also need to have FFmpeg installed on your system.

# debian-based
sudo apt install ffmpeg
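
A quick way to check the prerequisites before running the tool might look like the sketch below. The function names are hypothetical, and the command names whisper and vosk-transcriber are assumed to be the executables that the two pip packages place on your PATH; adjust them if your installation differs. The preference order mirrors the default described above.

```shell
#!/usr/bin/env bash
# Sketch only: pick_transcriber and have_ffmpeg are hypothetical helpers.
# Assumes the pip packages expose the commands "whisper" and "vosk-transcriber".

# Print which transcriber is available, preferring Whisper when both are.
pick_transcriber() {
    if command -v whisper >/dev/null 2>&1; then
        echo whisper
    elif command -v vosk-transcriber >/dev/null 2>&1; then
        echo vosk
    else
        echo "error: neither whisper nor vosk-transcriber found on PATH" >&2
        return 1
    fi
}

# Succeed only if ffmpeg is on PATH.
have_ffmpeg() {
    command -v ffmpeg >/dev/null 2>&1
}
```

For example, `pick_transcriber && have_ffmpeg || echo "missing prerequisites"` reports a problem if either check fails.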

Installation

# clone the repository
git clone https://github.com/dahlo/audiobook-chapter-splitter.git

Usage

  audiobook-chapter-splitter.sh
  -------------------------------------
  This script splits an audiobook file into chapters.

  Usage:
  audiobook-chapter-splitter.sh -i <input file> -o <output directory> -c <chapter keyword> [-w] [-v] [-a ARGS] [-h]
  ex.
  audiobook-chapter-splitter.sh -i audiobook.mp3 -o chapters -c kapitel -w -a "--model medium --language Swedish"

  Options:
    -i: Path to the input file
    -o: Path to the output directory
    -c: Keyword that is used to identify breakpoints between chapters (case-insensitive)
    -w: Use this flag to use OpenAI's Whisper for transcription
    -v: Use this flag to use Vosk for transcription
    -a: Pass any additional arguments to the transcriber using this flag
    -h: Print this help message
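
To process a whole folder of audiobooks, the script can be wrapped in a small loop. The sketch below uses only the options documented above (-i, -o, -c, -w); the names SPLITTER and split_all are hypothetical, the location of the script is an assumption, and the output directory is created up front in case the script does not do that itself.

```shell
#!/usr/bin/env bash
# Sketch only: SPLITTER and split_all are hypothetical names.
# Assumes the script sits in the current directory unless SPLITTER says otherwise.
SPLITTER="${SPLITTER:-./audiobook-chapter-splitter.sh}"

# Run the splitter on every .mp3 in src_dir, writing each book's chapters
# to its own subdirectory of out_root.
split_all() {
    local src_dir="$1" out_root="$2" keyword="$3" f name
    for f in "$src_dir"/*.mp3; do
        [ -e "$f" ] || continue              # no .mp3 files: the glob stays literal
        name=$(basename "$f" .mp3)
        mkdir -p "$out_root/$name"
        "$SPLITTER" -i "$f" -o "$out_root/$name" -c "$keyword" -w || return 1
    done
}
```

Calling `split_all ~/audiobooks ~/chapters kapitel` would then transcribe each book with Whisper and stop at the first file that fails.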
