Comments (9)
@Liujingxiu23 Thanks. I just found that I was measuring the time incorrectly. Generating mel and mel_postnet takes about 0.05 sec on a P100.
from fastspeech.
How do your synthesized wavs sound? I synthesized wavs with the model checkpoint "checkpoint_40000.pth.tar", and the quality is bad.
The total loss is as follows (I cut the top 30000):
I will post the resulting wav here soon.
@xcmyz Can you tell me how long it takes you to synthesize an utterance? Thanks!
@xcmyz Thank you for your reply. How many iterations does the model need to converge? I use the LJSpeech dataset.
@Liujingxiu23 My loss is about 0.2, which is not good either. How long does it take you to synthesize a new utterance?
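# Run inference; returns the mel spectrograms before and after the postnet.
mel_output, mel_output_postnet = model(src_seq, src_pos)
# Take the k-th item of the batch and move it to the CPU as a NumPy array.
mel = mel_output_postnet[k].detach().cpu().numpy()
# Invert the mel spectrogram to a waveform.
wav = audio.inv_mel_spectrogram(mel.T)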
For batch-size=10, seq_len=75, using one GPU, the time spent:
- to generate mel and mel_postnet: 0.159 sec
- to generate mel, mel_postnet and wavs: 37.625 sec
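To see which stage dominates, it can help to time each stage on its own rather than cumulatively. Below is a minimal sketch of that idea; `timed` and `profile_stages` are hypothetical helpers written for illustration, not part of this repo, and the stages in the demo are stand-ins for the model call and the mel-to-wav inversion. When the work runs on a GPU, CUDA kernels are launched asynchronously, so one would pass `torch.cuda.synchronize` as `sync` to make sure queued work is included in the measurement.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


def timed(fn: Callable[[], Any],
          sync: Optional[Callable[[], None]] = None) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed seconds).

    If sync is given, it is called before starting and before stopping the
    clock, so that asynchronous work (e.g. queued GPU kernels) is counted.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn()
    if sync is not None:
        sync()
    return result, time.perf_counter() - start


def profile_stages(stages: List[Tuple[str, Callable[[Any], Any]]],
                   sync: Optional[Callable[[], None]] = None
                   ) -> Tuple[Any, Dict[str, float]]:
    """Run (name, fn) stages in order, feeding each output to the next stage.

    Returns the final output and a dict of per-stage elapsed seconds.
    """
    timings: Dict[str, float] = {}
    value: Any = None
    for name, fn in stages:
        value, timings[name] = timed(lambda: fn(value), sync)
    return value, timings


if __name__ == "__main__":
    # Stand-in stages (assumptions): replace with the model forward pass
    # and the mel-to-wav inversion from the snippet above.
    stages = [
        ("mel", lambda _: [0.0] * 80),
        ("wav", lambda mel: [x * 2 for x in mel]),
    ]
    _, timings = profile_stages(stages)
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f} sec")
```

With per-stage numbers, the gap between the two totals above can be attributed directly to the wav generation step instead of being inferred by subtraction.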
@xcmyz I tried your latest code. The acoustic quality improved a lot; it is nearly the same as tacotron2, I think.
The TTS corpus I use is Chinese, and I kept the default hparams settings.
My loss is not as good as yours: the postnet mel loss converges to about 0.5 and the duration loss to about 0.8. I do not know why.
By the way, the pronunciation and the tones are not that good. For example, in some wavs "zhang" is read like "zhan" and "tao3" is read like "tao2". Why does this happen? Do you have any suggestions for solving this problem?