
This project is forked from erew123/alltalk_tts


AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, custom models and wav file maintenance. It can also be used with 3rd party software via JSON calls.



AllTalk TTS

AllTalk is an updated version of the Coqui_tts extension for Text Generation web UI. Features include:

  • Custom Start-up Settings: Adjust your default start-up settings. Screenshot
  • Narrator: Use different voices for main character and narration. Example Narration
  • Low VRAM mode: Improve generation performance if your VRAM is filled by your LLM. Screenshot
  • DeepSpeed: When DeepSpeed is installed you can get a 3-4x performance boost generating TTS.
  • Local/Custom models: Use any of the XTTSv2 models (API Local and XTTSv2 Local).
  • Optional wav file maintenance: Configurable deletion of old output wav files. Screenshot
  • Documentation: Fully documented with a built-in webpage. Screenshot
  • Console output: Clear command line output for any warnings or issues.
  • Standalone/3rd party support via JSON calls: Can be used with 3rd party applications via JSON calls.
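
As an illustration of the optional wav file maintenance feature listed above, here is a minimal sketch of age-based cleanup. It is not AllTalk's own code: the function name `delete_old_wavs`, the `max_age_days` parameter and the flat output folder layout are all assumptions made for the example.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_wavs(output_dir: Path, max_age_days: float,
                    now: Optional[float] = None) -> List[Path]:
    """Delete .wav files in output_dir older than max_age_days.

    Hypothetical helper: AllTalk's real maintenance logic may differ.
    Returns the list of deleted paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    deleted = []
    for wav in sorted(output_dir.glob("*.wav")):
        # Compare the file's last-modified time against the cutoff.
        if wav.is_file() and wav.stat().st_mtime < cutoff:
            wav.unlink()
            deleted.append(wav)
    return deleted
```

Calling `delete_old_wavs(Path("outputs"), 7)` would remove wav files last modified more than a week ago from a folder named `outputs` (a placeholder path) and leave every other file alone.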

Updates

The latest build (13 Dec 2023) has had the entire text filtering engine and narration engine rebuilt from scratch. How it actually works is highly complicated, but the end result is a much clearer TTS output and much better control over the narrator option and how to handle text that isn't within quotes or asterisks. It's a highly recommended update for the improved quality it gives to the TTS output, if nothing else.
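
To make the quotes/asterisks distinction concrete, here is a rough sketch of how text could be split into voiced segments. This is not the rebuilt engine, which is described as far more involved. The role names, the `split_segments` function and the mapping (quoted text to the character voice, asterisk-wrapped text to the narrator, everything else to a configurable default) are assumptions for illustration only.

```python
import re
from typing import List, Tuple

# Matches "quoted speech" or *asterisk-wrapped text*.
_SEGMENT = re.compile(r'"([^"]*)"|\*([^*]*)\*')


def split_segments(text: str, default_role: str = "narrator") -> List[Tuple[str, str]]:
    """Split text into (role, content) pairs.

    Assumed mapping (hypothetical, not AllTalk's actual rules):
    quoted text -> "character", asterisk text -> "narrator",
    anything outside both -> default_role.
    """
    segments = []
    pos = 0
    for match in _SEGMENT.finditer(text):
        # Text between the previous match and this one is in neither quotes nor asterisks.
        loose = text[pos:match.start()].strip()
        if loose:
            segments.append((default_role, loose))
        if match.group(1) is not None:
            role, body = "character", match.group(1)
        else:
            role, body = "narrator", match.group(2)
        body = body.strip()
        if body:
            segments.append((role, body))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((default_role, tail))
    return segments
```

For the input `*She smiled.* "Hello there!" and waved`, the sketch yields a narrator segment, a character segment, and a final segment whose voice depends on `default_role`.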

Should you want the older version of the narrator engine+text filtering, I will leave this older copy here

Installation on Text generation web UI

This has been tested on the current Dec 2023 release of Text generation webUI. If you have not updated it for a while, you may wish to update Text generation webUI first (instructions here).

  1. In a command prompt/terminal window, move into your Text generation webUI folder:

cd text-generation-webui

  2. Start the Text generation webUI Python environment for your OS:

cmd_windows.bat, ./cmd_linux.sh, ./cmd_macos.sh or cmd_wsl.bat

  3. Move into your extensions folder:

cd extensions

  4. Once there, git clone this repository:

git clone https://github.com/erew123/alltalk_tts

  5. Move into the alltalk_tts folder:

cd alltalk_tts

  6. Install the requirements:

Nvidia graphics card machines - pip install -r requirements_nvidia.txt

Other machines (mac, amd etc) - pip install -r requirements_other.txt

  7. Move back to the main Text generation webUI folder with cd .. (twice), start Text generation webUI (start_windows.bat, ./start_linux.sh, ./start_macos.sh or start_wsl.bat) and load the AllTalk extension in the Text generation webUI session tab.
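
If you script the install, the requirements step can be sketched as below. The two file names come from the step above. Detecting an Nvidia machine by looking for `nvidia-smi` on the PATH is only a heuristic assumed for this example, and the function names are hypothetical.

```python
import shutil
import sys
from typing import List, Optional


def requirements_file(has_nvidia: Optional[bool] = None) -> str:
    """Pick the requirements file named in the install steps."""
    if has_nvidia is None:
        # Assumption: a machine with nvidia-smi on PATH has an Nvidia card.
        has_nvidia = shutil.which("nvidia-smi") is not None
    return "requirements_nvidia.txt" if has_nvidia else "requirements_other.txt"


def pip_command(has_nvidia: Optional[bool] = None) -> List[str]:
    """Build the pip install command for the current Python environment."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_file(has_nvidia)]
```

Run from inside the alltalk_tts folder, with the Text generation webUI Python environment active, the command could be executed with `subprocess.run(pip_command(), check=True)`.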

Note: It can take a while to start up. Check the command prompt/terminal window if you want to know what it's doing. After it says "Model Loaded", the Text generation webUI is usually available on its IP address a few seconds later, for you to connect to in your browser.

Documentation: Click on the link when inside Text generation webUI as shown in the screenshot here

Where to find voices: https://aiartes.com/voiceai or https://commons.wikimedia.org/ or interviews on YouTube etc. Instructions on how to cut down and prepare a voice sample are within the built-in documentation.

Other installation notes

On first startup, AllTalk will download the Coqui XTTSv2 2.0.2 model to its models folder (1.8GB space required). You can customise your model or use the latest TTS model within the interface (details in documentation).

Once the extension is loaded, please find all documentation and settings on the link provided in the interface (as shown in the screenshot below).

To start AllTalk every time Text generation webUI loads, edit the Text generation webUI CMD_FLAGS.txt file in the main text-generation-webui folder and add --extensions alltalk_tts.
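
The same edit can be scripted. The sketch below appends the flag only if it is not already present. The helper name `ensure_extension_flag` is hypothetical, and treating lines that start with `#` as comments is an assumption about the file format. It does not merge with an existing --extensions entry, so if your CMD_FLAGS.txt already loads other extensions, add alltalk_tts to that entry by hand instead.

```python
from pathlib import Path

FLAG = "--extensions alltalk_tts"


def ensure_extension_flag(cmd_flags: Path, flag: str = FLAG) -> bool:
    """Append flag to the file unless an uncommented line already contains it.

    Returns True if the file was modified, False if the flag was already there.
    """
    text = cmd_flags.read_text(encoding="utf-8") if cmd_flags.exists() else ""
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: lines starting with '#' are comments and do not count.
        if not stripped.startswith("#") and flag in stripped:
            return False
    if text and not text.endswith("\n"):
        text += "\n"
    cmd_flags.write_text(text + flag + "\n", encoding="utf-8")
    return True
```

For example, `ensure_extension_flag(Path("text-generation-webui/CMD_FLAGS.txt"))` would add the line on the first run and do nothing on later runs.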

Screenshots

image image
