Download || Release Page || Installer Repo
This code provides a Gradio interface for generating audio from text input using the Bark TTS and Tortoise TTS models. The interface takes a text prompt as input and generates audio as output.
audio__bark__continued_generation__2023-05-04_16-07-49_long.webm
audio__bark__continued_generation__2023-05-04_16-09-21_long.webm
audio__bark__continued_generation__2023-05-04_16-10-55_long.webm
https://rsxdalv.github.io/bark-speaker-directory/ (https://github.com/rsxdalv/bark-speaker-directory)
This code requires the following dependencies:
bark
in models/bark directory from https://github.com/suno-ai/barkscipy
gradio
June 18:
- Update to newest audiocraft, add longer generations
...
June 5:
- Fix "Save to Favorites" button on bark generation page, clean up console (v4.1.1)
- Add "Collections" tab for managing several different data sets and easier curration.
June 4:
- Update to v4.1 - improved hash function, code improvements
June 3:
- Update to v4 - new output structure, improved history view, codebase reorganization, improved metadata, output extensions support
May __:
- Update to v3 - voice clone demo
May 17:
- Update to v2 - generate results as they appear, preview long prompt generations piece by piece, enable up to 9 outputs, UI tweaks
May 16:
- Add gradio settings tab, fix gradio errors in console, improve logging.
- Update History and Favorites with "use as voice" and "save voice" buttons
- Add voices tab
- Bark tab: Remove "or Use last generation as history"
- Improve code organization
May 13:
- Enable deterministic generation and enhance generated logs. Credits to suno-ai/bark#175.
May 10:
- Enable the possibility of reusing history prompts from older generations. Save generations as npz files. Add a convenient method of reusing any of the last 3 generations for the next prompts. Add a button for saving and collecting history prompts under /voices. rsxdalv#10
May 4:
- Long form generation (credits to https://github.com/suno-ai/bark/blob/main/notebooks/long_form_generation.ipynb and suno-ai/bark#161)
- Adapt to fixed env var bug
May 3:
- Improved Tortoise UI: Voice, Preset and CVVP settings as well as ability to generate 3 results (rsxdalv#6)
May 2 Update 2:
- Added support for history recylcing to continue longer prompts manually
May 2 Update 1:
- Added support for v2 prompts
Before:
- Added support for Tortoise TTS
git clone https://github.com/rsxdalv/bark.git
This project utilizes the following open source libraries:
-
suno-ai/bark - MIT License
- Description: A powerful library for XYZ.
- Repository: suno/bark
-
tortoise-tts - Apache-2.0 License
- Description: A flexible text-to-speech synthesis library for various platforms.
- Repository: neonbjb/tortoise-tts
-
ffmpeg - LGPL License
- Description: A complete and cross-platform solution for video and audio processing.
- Repository: FFmpeg
- Use: Encoding Vorbis Ogg files
-
ffmpeg-python - Apache 2.0 License
- Description: Python bindings for FFmpeg library for handling multimedia files.
- Repository: kkroening/ffmpeg-python
-
audiocraft - MIT License
- Description: A library for audio generation and MusicGen.
- Repository: facebookresearch/audiocraft
-
vocos - MIT License
- Description: An improved decoder for encodec samples
- Repository: charactr-platform/vocos