A tool to split monolithic audiobook files (mp3 or any format that FFmpeg and Whisper/Vosk can understand) into one file per chapter. It transcribes the audio with speech recognition (Whisper or Vosk) and looks for a user-defined keyword that separates chapters. Inspired by Dan Gravell's blog post "Splitting audiobooks into chapters with AI and crossed fingers".
The steps can be summarized as follows:
- Transcribe the audiobook file to an `srt` file using Whisper or Vosk.
- Use `grep` to find the keyword that separates chapters, and the timestamps it occurs at, in the `srt` file.
- Use `ffmpeg` to split the audiobook file into one file per chapter based on the timestamps.
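The last two steps can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `chapter_starts` and `print_split_commands` are hypothetical helper names, the `chapter_NNN.mp3` output naming is an assumption, and the `grep -B2` lookup assumes the keyword appears within the first two text lines of an SRT cue.

```shell
# Hypothetical helpers; names and output naming are illustrative only.

# Print the start time of every SRT cue whose text contains the keyword.
# grep -B2 pulls in the two lines before each match, which reaches the
# "HH:MM:SS,mmm --> HH:MM:SS,mmm" line of the same cue.
chapter_starts() {
    local srt_file="$1" keyword="$2"
    grep -i -B2 -- "$keyword" "$srt_file" \
        | grep -- ' --> ' \
        | sed 's/ --> .*//' \
        | tr ',' '.'    # ffmpeg expects a dot before the milliseconds
}

# Read start times on stdin and print (not run) one ffmpeg command per
# chapter. Each chapter ends where the next one starts; the last chapter
# runs to the end of the input file.
print_split_commands() {
    local input="$1" outdir="$2" prev="" n=0 start
    while IFS= read -r start; do
        if [ -n "$prev" ]; then
            n=$((n + 1))
            printf 'ffmpeg -i %s -ss %s -to %s -c copy %s/chapter_%03d.mp3\n' \
                "$input" "$prev" "$start" "$outdir" "$n"
        fi
        prev="$start"
    done
    if [ -n "$prev" ]; then
        n=$((n + 1))
        printf 'ffmpeg -i %s -ss %s -c copy %s/chapter_%03d.mp3\n' \
            "$input" "$prev" "$outdir" "$n"
    fi
}
```

For example, `chapter_starts audiobook.srt kapitel | print_split_commands audiobook.mp3 chapters` prints one command per chapter so you can review them before running anything. The sketch drops any audio before the first keyword and does not quote file names that contain spaces.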
My own use case is splitting audiobooks into chapters, but the tool works for any audio file that has a keyword separating its sections, in any language supported by the speech recognition tool.
You need either Whisper or Vosk installed on your system. If both are installed, Whisper is used by default unless you choose otherwise with the command-line arguments.
pip3 install vosk
# or
pip3 install openai-whisper
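The "Whisper by default" behaviour could be implemented along these lines. This is a sketch under assumptions, not the script's actual code: `pick_transcriber` is a hypothetical helper, and it assumes the pip packages put the `whisper` and `vosk-transcriber` commands on your PATH.

```shell
# Hypothetical helper: echo the transcriber to use.
# $1 is the user's explicit choice ("whisper" or "vosk"), or empty.
pick_transcriber() {
    local choice="$1"
    if [ -n "$choice" ]; then
        # An explicit choice always wins.
        echo "$choice"
    elif command -v whisper >/dev/null 2>&1; then
        # No explicit choice: prefer Whisper when it is installed.
        echo whisper
    elif command -v vosk-transcriber >/dev/null 2>&1; then
        echo vosk
    else
        echo "error: neither whisper nor vosk-transcriber found" >&2
        return 1
    fi
}
```

Calling `pick_transcriber ""` on a machine with both tools installed would print `whisper`; `pick_transcriber vosk` prints `vosk` regardless of what is installed.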
You also need to have FFmpeg installed on your system.
# debian-based
sudo apt install ffmpeg
# clone the repository
git clone https://github.com/dahlo/audiobook-chapter-splitter.git
audiobook-chapter-splitter.sh
-------------------------------------
This script splits an audiobook file into chapters.
Usage:
audiobook-chapter-splitter.sh -i <input file> -o <output directory> -c <chapter keyword> [-w] [-v] [-a ARGS] [-h]
ex.
audiobook-chapter-splitter.sh -i audiobook.mp3 -o chapters -c kapitel -w -a "--model medium --language Swedish"
Options:
-i: Path to the input file
-o: Path to the output directory
-c: Keyword that is used to identify breakpoints between chapters (case-insensitive)
-w: Use this flag to use OpenAIs Whisper for transcription
-v: Use this flag to use Vosk for transcription
-a: Pass any additional arguments to transcriber using this flag
-h: Print this help message