fastspeech2's People

Contributors

ga642381, ming024

fastspeech2's Issues

Training for Korean Languages

Hello authors,
First of all, thank you for this impressive repository.
I would like to retrain your model on a Korean dataset, for example KSS (Korean Single Speaker). However, the synthesized speech is not good for Korean. Could you give me some guidelines for this?

Thank you very much.

Invalid tensor shape

Dataset: VCTK. Model training completes without errors, but running synthesize.py crashes.

Additional diagnostic prints added just before the crash:

print("Print")
print(text)
print(src_len)
print(spk_ids)

Using cache found in /root/.cache/torch/hub/descriptinc_melgan-neurips_master
Synthesizing...
Weather forecast for tonight: dark
|{W EH1 DH ER0 F AO1 R K AE2 S T F AO1 R T AH0 N AY1 T D AA1 R K}|
Print
tensor([[144, 94, 91, 97, 104, 78, 130, 116, 71, 131, 133, 104, 78, 130,
133, 73, 119, 86, 133, 90, 66, 130, 116],
[144, 94, 91, 97, 104, 78, 130, 116, 71, 131, 133, 104, 78, 130,
133, 73, 119, 86, 133, 90, 66, 130, 116],
[144, 94, 91, 97, 104, 78, 130, 116, 71, 131, 133, 104, 78, 130,
133, 73, 119, 86, 133, 90, 66, 130, 116]], device='cuda:0')
tensor([23, 23, 23], device='cuda:0')
tensor([5, 6, 7], device='cuda:0')
Traceback (most recent call last):
File "synthesize.py", line 168, in <module>
synthesize(model, waveglow, melgan, text, sentence, prefix='step_{}'.format(args.step))
File "synthesize.py", line 85, in synthesize
mel, mel_postnet, log_duration_output, f0_output, energy_output, _, _, mel_len = model(text, src_len, speaker_ids=spk_ids)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
return self.gather(outputs, self.output_device)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/data_parallel.py", line 181, in gather
return gather(outputs, output_device, dim=self.dim)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/scatter_gather.py", line 78, in gather
res = gather_map(outputs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/scatter_gather.py", line 73, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
return Gather.apply(target_device, dim, *outputs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/_functions.py", line 75, in forward
return comm.gather(inputs, ctx.dim, ctx.target_device)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/comm.py", line 235, in gather
return torch._C._gather(tensors, dim, destination)
RuntimeError: Input tensor at index 1 has invalid shape [1, 293, 80], but expected [1, 371, 80]
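
One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: the model is wrapped in `nn.DataParallel`, the three inputs (speaker IDs 5, 6, 7) are scattered across replicas, and each replica predicts a mel-spectrogram of a different length (293 vs. 371 frames). `gather` then cannot concatenate them along the batch dimension. The sketch below uses plain Python lists as stand-ins for tensors to show the padding idea; `pad_to_longest` is a hypothetical helper, not part of the repository. With real tensors, `torch.nn.functional.pad` could play the same role, and running inference on the unwrapped `model.module` might avoid the gather step altogether.

```python
def pad_to_longest(mels, pad_value=0.0):
    """Pad each [frames][n_mels] nested list to the longest frame count.

    Returns the padded copies plus the original lengths, so the padding
    can be trimmed again before vocoding.
    """
    max_len = max(len(mel) for mel in mels)
    padded = []
    for mel in mels:
        n_mels = len(mel[0])
        filler = [[pad_value] * n_mels for _ in range(max_len - len(mel))]
        padded.append(mel + filler)  # builds a new list, input is untouched
    return padded, [len(mel) for mel in mels]


# Frame counts taken from the error message above; 80 is the mel dimension.
short = [[1.0] * 80 for _ in range(293)]
long = [[1.0] * 80 for _ in range(371)]

batch, lengths = pad_to_longest([short, long])
print([len(m) for m in batch], lengths)
```

After padding, every item has the same frame count, which is the condition the failing concatenation requires; the returned lengths record where the real frames end.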

Division by Zero

Dataset: VCTK corpus; not all words are present. Training crashes inside evaluate() at step 2000.

step : 1000
Epoch [1/1000], Step [1000/2688000]:
Total Loss: 2.4196, Mel Loss: 0.2047, Mel PostNet Loss: 0.2929, Duration Loss: 0.3105, F0 Loss: 72.4894, Energy Loss: 8.8656;
Time Used: 300.306s, Estimated Time Remaining: 781687.842s.
step : 2000
Epoch [1/1000], Step [2000/2688000]:
Total Loss: 1.8951, Mel Loss: 0.1795, Mel PostNet Loss: 0.1984, Duration Loss: 0.2650, F0 Loss: 48.0024, Energy Loss: 7.7215;
Time Used: 598.532s, Estimated Time Remaining: 686206.710s.
step: 2000 , length 249, tensor([249, 363, 264, 352, 281, 264, 221, 275, 264, 221, 307, 277, 298, 266,
286, 268], device='cuda:0')
Traceback (most recent call last):
File "train.py", line 265, in <module>
main(args)
File "train.py", line 239, in main
d_l, f_l, e_l, m_l, m_p_l = evaluate(model, current_step)
File "/home/FastSpeech2/evaluate.py", line 115, in evaluate
d_l = sum(d_l) / len(d_l)
ZeroDivisionError: division by zero
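
The traceback shows that `len(d_l)` is zero, meaning the list of duration losses was empty when it was averaged. A plausible cause, stated here as an assumption, is that the evaluation loop received no batches at all (for example, an empty validation split). The sketch below shows a guarded average; `safe_mean` is a hypothetical helper, not a function from the repository.

```python
import math


def safe_mean(values):
    """Average a sequence of losses, returning NaN when it is empty.

    Returning NaN (instead of raising ZeroDivisionError) lets training
    continue while still making the missing evaluation data visible in logs.
    """
    values = list(values)
    if not values:
        return math.nan
    return sum(values) / len(values)


print(safe_mean([0.25, 0.75]))  # normal case
print(safe_mean([]))            # empty evaluation set
```

A guard like this only hides the symptom; if the average comes back as NaN, the validation data is still worth checking.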

Pretrained model

Could you provide a pretrained model for the LibriTTS dataset?
Thanks.

ValueError: num_samples should be a positive integer value, but got num_samples=0

File "train.py", line 109, in __init_dataset
train_loader = DataLoader(
File "/home/ssn/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 344, in __init__
sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
File "/home/ssn/.local/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 107, in __init__
raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
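
`RandomSampler` raises this error when the dataset passed to `DataLoader` has length zero. A likely cause, given here as an assumption, is that the training metadata produced by preprocessing is empty or was read from the wrong path. The sketch below counts usable entries before a loader is built; `count_entries`, `require_nonempty`, and the `train.txt` name are hypothetical and used only for illustration.

```python
def count_entries(lines):
    """Count non-blank lines in an iterable of metadata lines."""
    return sum(1 for line in lines if line.strip())


def require_nonempty(lines, source="metadata"):
    """Fail early with a readable message if there is nothing to train on."""
    n = count_entries(lines)
    if n == 0:
        raise ValueError(
            f"{source} contains no entries; check the preprocessing output"
        )
    return n


# In practice the lines would come from an open file, e.g.
#   with open("train.txt", encoding="utf-8") as f:   # hypothetical path
#       require_nonempty(f, "train.txt")
sample = ["utt_0001|{HH AH0 L OW1}\n", "\n", "utt_0002|{W ER1 L D}\n"]
print(require_nonempty(sample, "sample"))
```

Checking the count before constructing the loader turns the sampler's generic message into one that names the file to inspect.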
