Comments (6)

chazo1994 commented on June 11, 2024

I used the demo_copy_synthesis_low_dim and demo_copy_synthesis_lossless scripts to analyse and synthesise speech, but demo_copy_synthesis_lossless gives a better result.

from magphase.

felipeespic commented on June 11, 2024

Hi @chazo1994
I fixed the issue.

So, download the newest MagPhase and disable the postfilter. The postfilter is only needed for TTS; for copy-synthesis, it is damaging.

Also, there is a recipe in Merlin (https://github.com/CSTR-Edinburgh/merlin/tree/master/egs/slt_arctic/s2), which is tested and works at a 16 kHz sample rate. I suggest using that recipe as a starting point for your experiments. Actually, I plan to remove the demo files for Merlin soon, since they are not needed anymore.

Please let me know if that works now, so I can close this issue.
Thanks

chazo1994 commented on June 11, 2024

Hi @felipeespic
I tried the newest MagPhase with the postfilter disabled and got better voice quality from demo_run_for_merlin. But it still has lower voice quality than demo_copy_synthesis_lossless, and the voice quality of demo_copy_synthesis_low_dim is also lower than that of demo_copy_synthesis_lossless. Samples of the source audio and the results are at the link below.
https://drive.google.com/file/d/10jJrU9Mqbho1B2vIQKz7k-dRcU9DHB0n/view?usp=drivesdk

Also, I have been using a recipe in Merlin in recent months, and I got the same low vocoder quality that I described in this issue.

chazo1994 commented on June 11, 2024

Hi @felipeespic
I changed the permissions of the file above so that anyone can edit it. Sorry, my mistake.

felipeespic commented on June 11, 2024

Hi @chazo1994

It is expected that the script demo_copy_synthesis_lossless achieves higher quality than demo_copy_synthesis_low_dim, since the former is a lossless operation. In other words, the synthesised signal should be exactly the same as the original waveform. By contrast, demo_copy_synthesis_low_dim compresses the parameters to a lower dimension to make them suitable for typical TTS systems, such as Merlin. For example, the lossless magnitude dimension is fft_len/2 (e.g., 1024), while its low-dimensional version has just 60 dimensions.
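
To illustrate why reducing 1024 magnitude values to 60 cannot be undone exactly, here is a minimal sketch in plain Python. It is not MagPhase's actual compression method: it simply averages adjacent bins into 60 uniform bands and then holds each band value constant to rebuild the frame. The test frame and the function names compress and expand are hypothetical, made up for this illustration.

```python
import math

def compress(mag, n_bands):
    """Average adjacent bins into n_bands roughly equal-width bands."""
    n = len(mag)
    edges = [round(i * n / n_bands) for i in range(n_bands + 1)]
    bands = [sum(mag[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]
    return bands, edges

def expand(bands, edges):
    """Rebuild a full-length frame by holding each band value constant."""
    out = []
    for value, (a, b) in zip(bands, zip(edges, edges[1:])):
        out.extend([value] * (b - a))
    return out

# Hypothetical magnitude frame: a smooth envelope plus a fine ripple.
n_bins = 1024
mag = [
    1.0
    + 0.5 * math.cos(2 * math.pi * k / n_bins)
    + 0.2 * math.sin(2 * math.pi * k / 8)
    for k in range(n_bins)
]

bands, edges = compress(mag, 60)   # 1024 values -> 60 values
rebuilt = expand(bands, edges)     # 60 values -> 1024 values again

# The fine ripple is averaged away, so the rebuilt frame differs from the original.
max_err = max(abs(a - b) for a, b in zip(mag, rebuilt))
print(len(bands), len(rebuilt), max_err)
```

Keeping all 1024 values, as the lossless script does, avoids this error by construction, at the cost of a much larger feature vector per frame.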

As far as I know, nobody has tried to use MagPhase lossless features to build a TTS system so far. You could in theory, but it would consume too many resources. Actually, that's an experiment that I want to run in the future.

chazo1994 commented on June 11, 2024

Hi @felipeespic
Thank you so much, I understand now. After training with Merlin, the quality of the system with the MagPhase vocoder is inferior to that with the WORLD vocoder. demo_copy_synthesis_low_dim also has lower voice quality than the WORLD vocoder. But demo_copy_synthesis_lossless is better than the WORLD vocoder, so maybe I will try to use MagPhase lossless features with Merlin, or increase the dimension beyond 60, to improve my TTS system, since I have large computing resources :).
