Tacotron 2

A PyTorch implementation of Tacotron 2, described in Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. Tacotron 2 is an end-to-end text-to-speech (TTS) neural network architecture that converts a character sequence directly to speech.

Install

Python 3.6+ (Anaconda recommended)
PyTorch 0.4.1+
pip install -r requirements.txt
To run egs/ljspeech/run.sh, download the LJ Speech Dataset, which is available for free.

Usage

Quick start

$ cd egs/ljspeech
# Set wav_dir to your LJ Speech directory
$ bash run.sh

That's all.

You can change a parameter with $ bash run.sh --parameter_name parameter_value, e.g., $ bash run.sh --stage 2. The available parameter names are defined in egs/ljspeech/run.sh, before the line . utils/parse_options.sh.
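The override mechanism can be illustrated with a minimal Python sketch. It assumes that utils/parse_options.sh follows the common Kaldi convention, where only variables already defined before the script is sourced can be overridden with --name value pairs. The parse_options helper and the default values below are hypothetical stand-ins, not code from this repository.

```python
def parse_options(defaults, argv):
    """Override entries of `defaults` from `--name value` pairs in `argv`.

    Hypothetical helper that mimics Kaldi-style option parsing: only names
    already present in `defaults` may be overridden.
    """
    options = dict(defaults)
    args = list(argv)
    while args and args[0].startswith("--"):
        name = args.pop(0)[2:].replace("-", "_")
        if name not in options:
            raise ValueError("unknown option: --" + name)
        if not args:
            raise ValueError("missing value for --" + name)
        raw = args.pop(0)
        # Keep the type of the default value (int stays int, str stays str).
        options[name] = type(options[name])(raw)
    return options, args  # args now holds the remaining positional arguments


# Assumed defaults, standing in for the variables set at the top of run.sh.
defaults = {"stage": 1, "batch_size": 32, "id": "0"}
options, rest = parse_options(defaults, ["--stage", "2", "--id", "0,1"])
print(options)  # {'stage': 2, 'batch_size': 32, 'id': '0,1'}
```

Rejecting unknown names is the useful part of this convention: a mistyped option fails immediately instead of being silently ignored.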

Workflow

Workflow of egs/ljspeech/run.sh:

Stage 1: Training
Stage 2: Synthesising
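
As a rough illustration of what --stage does, the sketch below assumes that run.sh guards each stage with a Kaldi-style check (a stage runs when the requested stage number is less than or equal to its own). This is an assumption about the script, and stages_to_run is a hypothetical helper; run.sh itself is the authority.

```python
def stages_to_run(stage, last_stage=2):
    """Return the stage numbers that run when starting from `stage`."""
    return [s for s in range(1, last_stage + 1) if stage <= s]


print(stages_to_run(1))  # [1, 2]  training, then synthesis
print(stages_to_run(2))  # [2]     synthesis only, training is skipped
```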

More detail

egs/ljspeech/run.sh provides example usage.

# Set PATH and PYTHONPATH
$ cd egs/ljspeech/; . ./path.sh
# Train:
$ train.py -h
# Synthesise audio:
$ synthesis.py -h

How to visualize loss?

If you want to visualize your loss, you can use visdom:

Open a new terminal on your remote server (tmux recommended) and run $ visdom
Open another terminal and run $ bash run.sh --visdom 1 --visdom_id "<any-string>" or $ train.py ... --visdom 1 --visdom_id "<any-string>"
Open your browser and go to <your-remote-server-ip>:8097, e.g., 127.0.0.1:8097
On the visdom page, choose <any-string> under Environment to see your loss
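
For readers who want to see what loss logging to visdom can look like, here is a minimal sketch. It is not the logging code used by train.py. The plot_loss helper, the window name, and the environment name are assumptions made for illustration; only Visdom.win_exists and Visdom.line are part of the visdom client API.

```python
import numpy as np


def plot_loss(vis, step, loss, win="train_loss"):
    """Append one (step, loss) point to a visdom line plot.

    `vis` is expected to be a visdom.Visdom instance; `win` is an
    arbitrary window name chosen for this sketch.
    """
    # Create the window on the first call, append to it afterwards.
    update = "append" if vis.win_exists(win) else None
    vis.line(
        X=np.array([step]),
        Y=np.array([loss]),
        win=win,
        update=update,
        opts=dict(title="Training loss", xlabel="step", ylabel="loss"),
    )


# Usage, with a visdom server already running (started with `$ visdom`):
#
#   import visdom
#   vis = visdom.Visdom(env="my-experiment")  # plays the role of <any-string>
#   for step, loss in enumerate([1.0, 0.7, 0.5]):  # made-up loss values
#       plot_loss(vis, step, loss)
```

The env argument is what appears in the Environment selector on the visdom page, which is why a distinct string per experiment keeps the plots apart.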

How to resume training?

$ bash run.sh --continue_from <model-path>
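
Resuming generally means restoring both the model weights and the optimizer state from a saved file. The sketch below shows one common PyTorch pattern for this. The checkpoint layout (the keys "epoch", "model", and "optimizer") and the helper names are assumptions for illustration, not the format that this repository writes.

```python
import os
import tempfile

import torch


def save_checkpoint(path, model, optimizer, epoch):
    """Write model weights, optimizer state, and the finished epoch to `path`."""
    torch.save(
        {
            "epoch": epoch,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore state in place and return the epoch to resume from."""
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1


model = torch.nn.Linear(4, 2)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters())
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
save_checkpoint(path, model, optimizer, epoch=3)
print(load_checkpoint(path, model, optimizer))  # 4
```

Restoring the optimizer state matters for adaptive optimizers such as Adam, whose running statistics would otherwise restart from zero.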

How to use multi-GPU?

Pass a comma-separated sequence of GPU ids, such as:

$ bash run.sh --id "0,1"
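
How the script consumes the --id value is not shown here. As a sketch of one common approach, the string can be split into integer ids and used to restrict which devices CUDA sees. The parse_gpu_ids helper is hypothetical.

```python
import os


def parse_gpu_ids(id_string):
    """Turn a string such as "0,1" into a list of integer GPU ids."""
    return [int(part) for part in id_string.split(",") if part.strip()]


gpu_ids = parse_gpu_ids("0,1")
# Restrict CUDA to the selected devices; this must be set before CUDA is initialised.
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
print(gpu_ids)  # [0, 1]
```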

How to solve out of memory?

If this happens during training, try reducing batch_size or using more GPUs: $ bash run.sh --batch_size <lower-value> or $ bash run.sh --id "0,1".
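
Both remedies work by shrinking the number of utterances each GPU has to hold at once. The arithmetic below is a sketch that assumes data-parallel training where each batch is split along the batch dimension across GPUs, as torch.nn.DataParallel does. Whether this repository uses that mechanism is an assumption, and per_gpu_batch is a hypothetical helper.

```python
import math


def per_gpu_batch(batch_size, num_gpus):
    """Largest per-GPU chunk when a batch is split across `num_gpus` devices."""
    return math.ceil(batch_size / num_gpus)


print(per_gpu_batch(32, 1))  # 32
print(per_gpu_batch(32, 2))  # 16
print(per_gpu_batch(30, 4))  # 8
```

Activation memory grows roughly with the per-GPU chunk, so halving batch_size or doubling the GPU count has a similar effect on peak usage per device.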

Reference and Resource

NOTE

This is a work in progress and any contribution is welcome (the dev branch is the main development branch).

Currently, the feature prediction network is implemented, with Griffin-Lim used to synthesise speech.

Attention and synthesised audio on 37k iterations:
