jinhan / tacotron2-vae

166.0 166.0 33.0 19.88 MB

Implementation of "Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis"

License: BSD 3-Clause "New" or "Revised" License

Python 2.06% Jupyter Notebook 97.80% CSS 0.01% JavaScript 0.06% HTML 0.07%

tacotron2-vae's Issues

Pretrained model?

This project looks very interesting. Could we have a pretrained model, please?
I couldn't download the dataset from the link you provided because I never received the confirmation number when I registered on the website with my student email.
Thanks for your kind attention.

Speaker ID in inference notebook

Thank you for this amazing implementation. Everything is working perfectly except I wasn't able to figure out how to select speakers during inference (I have trained with 5 speakers).

Any hints would be really appreciated!

inference speed on CPU

Hi.
I am comparing the training and inference speed of different multi-speaker TTS models on a single CPU and on a single GPU.
I would appreciate any explanation of this for the current model or for any other multi-speaker TTS models.

Using Phoneme

May I ask whether, for Korean, the language you are using, the input is phonemes or Korean words?
I am trying to use phonemes for my language. Do you have any suggestions for me?

dataset

Excuse me, in your code I found that you also used the English dataset IEMOCAP, with a format like this:
/home/jinhan/Storage/IEMOCAP/IEMOCAP_processed/session4_M/neu/psn4M_neu_s212_orgn.wav|whose is it|7|0
/home/jinhan/Storage/IEMOCAP/IEMOCAP_processed/session3_F/neu/psn3F_neu_s497_orgn.wav|eight pieces of luggage okay|4|0

Could you share the wav files and their transcripts? I have some problems processing the original dataset. My email is [email protected]. I hope for an answer. Thanks very much!
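
For anyone preparing their own filelist in this shape, here is a minimal sketch of how one of the pipe-separated lines quoted above might be parsed. It assumes the four fields are the wav path, the transcript, a speaker ID, and an emotion ID; the meaning of the last two fields is inferred from the sample lines, not confirmed by the repository. The names `FilelistEntry` and `parse_filelist_line` are hypothetical and are not taken from the project's code.

```python
from typing import NamedTuple


class FilelistEntry(NamedTuple):
    wav_path: str
    text: str
    speaker_id: int  # assumption: third field is a speaker index
    emotion_id: int  # assumption: fourth field is an emotion index


def parse_filelist_line(line: str) -> FilelistEntry:
    """Split one 'path|text|id|id' line into typed fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}")
    wav_path, text, speaker, emotion = parts
    return FilelistEntry(wav_path, text, int(speaker), int(emotion))


# One of the sample lines quoted in the issue above.
example = (
    "/home/jinhan/Storage/IEMOCAP/IEMOCAP_processed/session4_M/neu/"
    "psn4M_neu_s212_orgn.wav|whose is it|7|0"
)
entry = parse_filelist_line(example)
print(entry.text, entry.speaker_id, entry.emotion_id)
```

A line with a missing or extra field raises `ValueError`, which makes malformed rows easy to spot before training.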
