styletts_server's Introduction

Style TTS Server

We have developed a Text-to-Speech (TTS) system built on StyleTTS2. It processes and serves 100 characters in 410 milliseconds on an AWS G5 large instance. The system is implemented as a simple HTTP server, which makes it straightforward to integrate and use. It exposes StyleTTS2 features such as voice cloning and text-to-audio conversion, so you can generate high-quality, natural-sounding audio from text input.

Installation

python3 -m venv env
. ./env/bin/activate
pip install -r requirements.txt

Start Server

python3 main.py

By default, the server uses port 8700.
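Before sending requests, it can help to confirm that something is listening on that port. The following is a minimal sketch, assuming the server runs locally on port 8700; `is_server_listening` is a hypothetical helper name and is not part of this project.

```python
import socket


def is_server_listening(host: str = "127.0.0.1", port: int = 8700, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server reachable:", is_server_listening())
```

This only checks that the TCP port accepts connections; it does not verify that the model has finished loading.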

Making an API call

import time
from base64 import b64decode

import requests

headers = {
    'accept': 'application/json',
    'Content-Type': 'application/json',
}

json_data = {
    'text': 'hello world i am R Ansh Joseph whats your name',
    'rate': 8000,           # audio sample rate
    'voice_id': 'default',  # name of the voice file in the voices folder
    'alpha': 0.3,
    'beta': 0.7,
    'diffusion_steps': 5,
    'embedding_scale': 1,
}

# Time the full round trip to the server
start = time.time()
response = requests.post('http://127.0.0.1:8700/tts', headers=headers, json=json_data)
response.raise_for_status()  # stop here on an HTTP error instead of failing on a missing key
result = response.json()
print(time.time() - start)

# The audio comes back base64-encoded under the 'audio' key; decode it and write it to disk
with open('audio.wav', 'wb') as file:
    file.write(b64decode(result['audio']))

Note: you can change the audio sample rate by changing rate in json_data, and you can change the voice by changing voice_id.
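If you send many requests that differ only in rate or voice_id, it may be convenient to wrap the payload in small helpers. This is a sketch only: `build_payload` and `decode_audio` are hypothetical names, the default values are copied from the example request above, and the response is assumed to carry the base64 audio under the `audio` key as in that example.

```python
import base64


def build_payload(text, voice_id="default", rate=8000, alpha=0.3, beta=0.7,
                  diffusion_steps=5, embedding_scale=1):
    """Build the JSON body for the /tts endpoint using the fields from the example request."""
    return {
        "text": text,
        "rate": rate,
        "voice_id": voice_id,
        "alpha": alpha,
        "beta": beta,
        "diffusion_steps": diffusion_steps,
        "embedding_scale": embedding_scale,
    }


def decode_audio(response_json):
    """Decode the base64 string stored under the 'audio' key into raw bytes."""
    return base64.b64decode(response_json["audio"])
```

The dictionary returned by `build_payload` can be passed as the `json=` argument of `requests.post`, and the bytes returned by `decode_audio` can be written to a file opened in `'wb'` mode.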

Adding More Voices

To add a voice, put an audio file in the voices directory. The file name, without its extension, is the voice_id for that voice.

For example, the voices folder contains default.wav by default. To add a new voice, put a new audio file in this folder.

To use the new voice, set voice_id in the payload to the name of the new file, as in this example:

json_data = {
    'text': 'hello world i am R Ansh Joseph whats your name',
    'rate':8000,
    'voice_id': 'ansh',
    'alpha': 0.3,
    'beta': 0.7,
    'diffusion_steps': 5,
    'embedding_scale': 1,
}

Note: if you add a new voice while the server is running, restart the server.
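To see which voice_id values a folder would provide, you can list the file names without their extensions. This is a client-side sketch under two assumptions: the folder is named `voices` relative to the current directory, and the voice_id is the file stem (as with default.wav and `default`). `list_voice_ids` is a hypothetical helper and does not reflect how the server itself loads voices.

```python
from pathlib import Path


def list_voice_ids(voices_dir="voices"):
    """Return the sorted file stems found in voices_dir, e.g. default.wav -> 'default'."""
    folder = Path(voices_dir)
    if not folder.is_dir():
        # A missing folder simply yields no voices
        return []
    return sorted(p.stem for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    print(list_voice_ids())
```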

Note: for single-word input, the model produces weird-sounding audio.
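A client may want to detect single-word input before calling the server. The check below is a minimal sketch; `is_single_word` is a hypothetical helper, and what to do with such input (skip it, warn, or send longer text) is left to the caller.

```python
def is_single_word(text: str) -> bool:
    """Return True when the text contains at most one whitespace-separated word."""
    return len(text.split()) <= 1
```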

Contributors

anshjoseph
