
Comments (5)

TaskManager91 commented on May 27, 2024

Using "tts" or "tts-server" works like a charm; only the Python integration appears to be affected. Unfortunately, the command-line method reloads the model every time I call it from my code, which has a significant impact on performance.
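One way to avoid the reload cost is to keep the loaded model object in memory and reuse it for every call. The following is only a minimal sketch: `ModelCache` and `load_coqui_model` are hypothetical helper names, not part of Coqui TTS, and the `TTS(model_name=..., gpu=False)` call simply mirrors the snippet quoted elsewhere in this thread.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class ModelCache(Generic[T]):
    """Load each model once per process and return the same object afterwards."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._models: Dict[str, T] = {}

    def get(self, model_name: str) -> T:
        # Only call the (slow) loader the first time a model name is requested.
        if model_name not in self._models:
            self._models[model_name] = self._loader(model_name)
        return self._models[model_name]


def load_coqui_model(model_name: str):
    # Hypothetical loader; imported lazily so the cache itself can be used
    # (and tested) without Coqui TTS installed.
    from TTS.api import TTS

    return TTS(model_name=model_name, gpu=False)


# Nothing is loaded here; the model is created on the first cache.get(...) call.
cache = ModelCache(load_coqui_model)
```

Each later `cache.get("tts_models/de/thorsten/vits")` would then return the already loaded object, so `tts_to_file(...)` can be called on it repeatedly without paying the load time again.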

As for "/deu/fairseq/vits": Coqui-TTS has a new Fairseq integration.

from thorsten-voice.

thorstenMueller commented on May 27, 2024

I did some further testing using your text "Öffentliche trübe Tragödie" on Microsoft Windows using my Thorsten-DDC model.

Running:

espeak-ng "Öffentliche trübe Tragödie" --ipa

produces this odd-looking output:

'??f?ntl??t? t?'u?b t?'ad???di

Running it on my Mac using espeak (not -ng) gives this better-looking output:

ˌɜːfəntlˈiːʃ tɹˈuːb tɹˈadʒɜːdi
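The same comparison can be scripted so it is easy to repeat on each machine. This is a rough sketch, not a definitive diagnostic: `build_ipa_command`, `phonemize`, and `looks_garbled` are hypothetical helper names, the `-q` (no audio) and `-v` (voice) flags are additions to the command shown above, and treating a literal `?` as a sign of lost characters is only a heuristic that detects the symptom, not its cause.

```python
import subprocess
from typing import List, Optional


def build_ipa_command(text: str, exe: str = "espeak-ng",
                      voice: Optional[str] = None) -> List[str]:
    """Build the argument list for an IPA-only espeak / espeak-ng call."""
    cmd = [exe, "-q", "--ipa"]  # -q: do not play audio, --ipa: print phonemes
    if voice:
        cmd += ["-v", voice]
    cmd.append(text)
    return cmd


def phonemize(text: str, exe: str = "espeak-ng",
              voice: Optional[str] = None) -> str:
    """Run the phonemizer and decode its output explicitly as UTF-8."""
    result = subprocess.run(build_ipa_command(text, exe, voice),
                            capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace").strip()


def looks_garbled(ipa: str) -> bool:
    """Heuristic: '?' or U+FFFD suggests characters were lost along the way."""
    return "?" in ipa or "\ufffd" in ipa
```

Calling `looks_garbled(phonemize("Öffentliche trübe Tragödie"))` with `exe="espeak-ng"` and then with `exe="espeak"` would show whether the two binaries behave differently on a given system.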

Finally, I removed espeak-ng on Windows, installed espeak (1.48.04), and added the exe file to my PATH environment variable. I then ran your script with my Thorsten-DDC model, and the umlauts are spoken correctly.
So it seems to be an issue with the combination of umlauts, espeak-ng, and Microsoft Windows. A quick search turned up this open issue:

espeak-ng/espeak-ng#1555
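To confirm which phonemizer binary is actually reachable after changing PATH, a small check like the one below may help. It is a sketch under stated assumptions: `find_espeak` is a hypothetical helper, and checking classic espeak first only mirrors the workaround described here; which binary Coqui TTS itself selects is not verified by this code.

```python
import shutil
from typing import Callable, Optional, Tuple


def find_espeak(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[Tuple[str, str]]:
    """Return (name, path) of the first espeak variant found on PATH, or None."""
    # Classic espeak is checked first, matching the workaround in this thread.
    for name in ("espeak", "espeak-ng"):
        path = which(name)
        if path:
            return name, path
    return None
```

A result of `None` would mean neither binary is on PATH; a result naming `espeak-ng` on Windows would mean the problematic variant is still installed.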

Hope this helps you 😊.


TaskManager91 commented on May 27, 2024

Perfect - Thanks a lot!


thorstenMueller commented on May 27, 2024

Hi, I'll try to check it soon, once I'm sitting at my Windows machine 😉. Does it work when you just run "tts" or "tts-server", and only fail with the Python integration?
I'm also curious what that "/deu/fairseq/vits" is about, as I don't see it here.


thorstenMueller commented on May 27, 2024

Hi @TaskManager91, I can confirm the problem. I ran your snippet (just without GPU) successfully on macOS and ran into the same problem on Windows.

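from TTS.api import TTS
# Load the espeak-ng based Thorsten-DDC model on the CPU.
tts_model = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", gpu=False)
# Synthesize a sentence containing umlauts and write it to a WAV file.
tts_model.tts_to_file("Öffentliche trübe Tragödie", file_path="voice_bot.wav")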

To rule out a general Windows encoding problem on my system, I tested it with my Thorsten-VITS model, which does not have the problem.
Maybe you can test with this model:

tts_models/de/thorsten/vits

The DDC model was trained with espeak-ng as the phonemizer, while the VITS model uses gruut. That might be the cause; I'll try to figure it out.
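Until the cause is found, a script could fall back to the gruut-based VITS model on Windows only. This is a minimal sketch of that idea, assuming the two model names quoted in this thread; `pick_model` is a hypothetical helper, and the platform check is a workaround rather than a fix.

```python
import sys

DDC_MODEL = "tts_models/de/thorsten/tacotron2-DDC"  # phonemized with espeak-ng
VITS_MODEL = "tts_models/de/thorsten/vits"          # phonemized with gruut


def pick_model(platform: str = sys.platform) -> str:
    """Use the gruut-based model on Windows, the DDC model everywhere else."""
    return VITS_MODEL if platform.startswith("win") else DDC_MODEL
```

The returned name can be passed as `model_name` to `TTS(...)` exactly as in the snippet above.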

