mycroftai / mimic3

A fast local neural text to speech engine for Mycroft

License: GNU Affero General Public License v3.0

Shell 8.83% Python 79.61% HTML 8.93% Makefile 1.06% Dockerfile 1.56%

mimic3's Introduction

Mimic 3

A fast and local neural text to speech system developed by Mycroft for the Mark II.

Quickstart

Mycroft TTS Plugin

# Install system packages
sudo apt-get install libespeak-ng1

# Ensure that you're using the latest pip
mycroft-pip install --upgrade pip

# Install plugin
mycroft-pip install mycroft-plugin-tts-mimic3[all]

# Activate plugin
mycroft-config set tts.module mimic3_tts_plug

# Start mycroft
mycroft-start all

See documentation for more details.

Web Server

mkdir -p "${HOME}/.local/share/mycroft/mimic3"
chmod a+rwx "${HOME}/.local/share/mycroft/mimic3"
docker run \
       -it \
       -p 59125:59125 \
       -v "${HOME}/.local/share/mycroft/mimic3:/home/mimic3/.local/share/mycroft/mimic3" \
       'mycroftai/mimic3'

Visit http://localhost:59125 or from another terminal:

curl -X POST --data 'Hello world.' --output - localhost:59125/api/tts | aplay

See documentation for more details.
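The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; it assumes the server from the Docker example is listening on localhost:59125 and that the response body is WAV audio (the curl example pipes it to aplay). The function names build_tts_request and synthesize are illustrative and not part of Mimic 3.

```python
import urllib.request


def build_tts_request(text, base_url="http://localhost:59125"):
    """Build the POST request for /api/tts, mirroring the curl example."""
    return urllib.request.Request(
        f"{base_url}/api/tts",
        data=text.encode("utf-8"),  # request body is the text to speak
        method="POST",
    )


def synthesize(text, base_url="http://localhost:59125", timeout=60):
    """Send text to a running server and return the raw response bytes."""
    request = build_tts_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the server running, writing the bytes returned by synthesize("Hello world.") to a file opened in binary mode should give a playable file, under the assumptions above.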

Command-Line Tool

# Install system packages
sudo apt-get install libespeak-ng1

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip

pip3 install mycroft-mimic3-tts[all]

Now you can run:

mimic3 'Hello world.' | aplay

Use mimic3-server and mimic3 --remote ... for repeated usage (much faster).

See documentation for more details.

License

Mimic 3 is available under the AGPL v3 license

mimic3's Issues

TypeError: Object of type set is not JSON serializable

Describe the bug
Visiting the mimic3 web page results in empty drop-down fields and errors in the terminal. I have made no changes to mimic3. It worked yesterday, and now no longer does.

To Reproduce
Steps to reproduce the behaviour:
On Kubuntu 22.10

cd ~/Source/mycroftai
sudo apt install python3.10-venv
git clone https://github.com/MycroftAI/mimic3
cd mimic3
./install.sh
source .venv/bin/activate
mimic3-server
Then open http://0.0.0.0:59125 in a browser.

Opening the web page looks like this.

Output from the command line looks like this:

ERROR:mimic3_http.app:Object of type set is not JSON serializable
Traceback (most recent call last):
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/app.py", line 1673, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/app.py", line 1718, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/home/alan/Source/mycroftai/mimic3/mimic3_http/app.py", line 270, in api_voices
    return jsonify(voice_dicts)
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/json/__init__.py", line 32, in jsonify
    return current_app.json.response(*args, **kwargs)
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/json/provider.py", line 205, in response
    return self._app.response_class(self.dumps(object_, **dump_args), mimetype=self.mimetype)
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/json/provider.py", line 171, in dumps
    return json.dumps(object_, **kwargs)
  File "/usr/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/alan/Source/mycroftai/mimic3/.venv/lib/python3.10/site-packages/quart/json/provider.py", line 114, in _default
    raise TypeError(f"Object of type {type(object_).__name__} is not JSON serializable")
TypeError: Object of type set is not JSON serializable

Expected behaviour
The web page should show like this, and be usable.

Environment (please complete the following information):

Device type: Laptop (ThinkPad T450)
OS: Kubuntu
Mycroft-core version: Cloned from master of this repo

Additional context
Worth noting this exact setup on this machine worked perfectly yesterday. I know this because I wrote a blog post about it. I left mimic running on my laptop, changed network a couple of times, came home, and tried again, and now it's completely failing. I even wiped the folder and re-cloned and reinstalled as per the instructions above.
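The traceback shows jsonify(voice_dicts) failing because a Python set sits somewhere inside the data, and the standard json encoder cannot serialize sets. The sketch below reproduces that failure and shows one possible workaround, a default hook that converts sets to sorted lists. The dictionary fields are hypothetical (the report does not say which field holds the set), and this is not presented as the project's actual fix.

```python
import json


def to_jsonable(value):
    """Fallback for json.dumps: encode sets as sorted lists."""
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Hypothetical voice entry; only the voice key appears elsewhere on this page.
voice = {"key": "de_DE/thorsten_low", "speakers": {"b", "a"}}

try:
    json.dumps(voice)
    plain_error = None
except TypeError as error:
    plain_error = str(error)  # same message as in the report

encoded = json.dumps(voice, default=to_jsonable)
```

Sorting assumes the set elements are mutually comparable, which holds for a set of strings.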

Mispronunciation of the word "de" in all french models

Describe the bug
Listening to the French audio samples provided by Mimic 3, we can hear that the word "de" (of) is pronounced as "dem" in every model. This really affects how the sentences sound, since "de" is a very common word in French.

Expected behavior
Correct the pronunciation of the word "de" (of) in all French models. It is not to be confused with the word "deux" (two), which sounds similar.

Deux is pronounced /dø/.
De is pronounced /də/.

https://www.masteryourfrench.com/how-to-pronounce/de-versus-deux/

If you need a French native speaker to help correct this issue please let me know.
If this is not the right place to post this please also let me know.

Thank you for your work on Mimic3, very impressive and accurate overall!

The docker image doesn't work

Describe the bug
The docker image doesn't work

To Reproduce

docker run --rm -p 59125:59125 mycroftai/mimic3
Navigate to localhost:59125
Hit the "speak" button

Result:

PermissionError: [Errno 13] Permission denied: '/home/mimic3/.local'

Expected behavior
Not an error.

Log files

INFO:__main__:Starting web server
[2023-02-06 17:27:27 +0000] [1] [INFO] Running on http://0.0.0.0:59125 (CTRL + C to quit)
INFO:hypercorn.error:Running on http://0.0.0.0:59125 (CTRL + C to quit)
ERROR:mimic3_http.synthesis:Error during inference
Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices/af_ZA/google-nwu_low'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices/af_ZA'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
    result = do_synthesis(item, mimic3)
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
    raise e
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
    mimic3.speak_text(params.text, text_language=params.text_language)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 368, in speak_text
    voice = self._get_or_load_voice(self.voice)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 569, in _get_or_load_voice
    maybe_model_dir = self._download_voice(voice_key)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 621, in _download_voice
    download_voice(
  File "/home/mimic3/app/mimic3_tts/download.py", line 87, in download_voice
    voice_dir.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  [Previous line repeated 3 more times]
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/home/mimic3/.local'
ERROR:mimic3_http.app:[Errno 13] Permission denied: '/home/mimic3/.local'
Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices/af_ZA/google-nwu_low'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices/af_ZA'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3/voices'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft/mimic3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share/mycroft'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mimic3/.local/share'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1673, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1718, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/home/mimic3/app/mimic3_http/app.py", line 215, in app_tts
    wav_bytes = await text_to_wav(
  File "/home/mimic3/app/mimic3_http/app.py", line 103, in text_to_wav
    wav_bytes = await future
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
    result = do_synthesis(item, mimic3)
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
    raise e
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
    mimic3.speak_text(params.text, text_language=params.text_language)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 368, in speak_text
    voice = self._get_or_load_voice(self.voice)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 569, in _get_or_load_voice
    maybe_model_dir = self._download_voice(voice_key)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 621, in _download_voice
    download_voice(
  File "/home/mimic3/app/mimic3_tts/download.py", line 87, in download_voice
    voice_dir.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1316, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  [Previous line repeated 3 more times]
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/home/mimic3/.local'

Environment (please complete the following information):

Device type: desktop
OS: linux mint 21.1
Mycroft-core version: ?
Other versions: ?

Additional context
Since the image doesn't work, I tried building my own by tweaking the Dockerfile, without success so far: I always end up with a missing libespeak-ng.so.1.
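The log shows mkdir walking up from the voices directory until it reaches /home/mimic3/.local, where creating the directory is denied. The Quickstart's Web Server section creates a host directory, makes it writable, and mounts it with -v; the command in this report omits that mount, though the report does not confirm that this alone explains the error. The sketch below is a small diagnostic, with illustrative helper names, that finds the nearest existing ancestor of a target directory and checks whether the current process could create directories under it.

```python
import os
from pathlib import Path


def nearest_existing_ancestor(path):
    """Walk upwards from path until a directory entry that exists is found."""
    current = Path(path)
    while not current.exists():
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return current


def can_create(path):
    """True if the nearest existing ancestor is a writable, searchable directory."""
    ancestor = nearest_existing_ancestor(path)
    return ancestor.is_dir() and os.access(ancestor, os.W_OK | os.X_OK)
```

Running can_create on the voices path as the container's user would be one way to check for this condition before the first download.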

Python package documentation

Without consistent documentation, the package is pretty hard to use
Mimic 3 is pretty impressive, but without documentation it is very hard to use in other projects.

There's a need for python Mimic 3 documentation
This would help other developers integrate the package in projects that right now have to rely on proprietary TTS engines like GTTS or sapi5. Mimic 3 is probably the frontier in the privacy-focused offline TTS world and it would receive wider adoption and more contributions if it were well-documented and easier to use.

Viseme support

I'm looking for an alternative to Amazon Polly and came across Mimic 3 TTS, which looks very promising. For my use case, however, I need the visemes that Polly provides. Are there any plans to provide this feature in Mimic 3 as well?

Fresh "From Source" install. No module named 'zoneinfo'

I just tried to compile mimic3 from source on an RPi 4 (OS: Yocto B2Qt image v 4.1.1). It compiles fine, but when I try to run ./mimic3-server in .venv/bin I get:

./mimic3-server   
Traceback (most recent call last):
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/pytz_deprecation_shim/_compat_py3.py", line 5, in <module>
    import zoneinfo
ModuleNotFoundError: No module named 'zoneinfo'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/root/mimic3/.venv/bin/./mimic3-server", line 33, in <module>
    sys.exit(load_entry_point('mycroft-mimic3-tts', 'console_scripts', 'mimic3-server')())
  File "/home/root/mimic3/.venv/bin/./mimic3-server", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 171, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/root/mimic3/mimic3_http/__main__.py", line 25, in <module>
    from .app import get_app
  File "/home/root/mimic3/mimic3_http/app.py", line 41, in <module>
    from mimic3_tts import DEFAULT_VOICE, Mimic3Settings, Mimic3TextToSpeechSystem
  File "/home/root/mimic3/mimic3_tts/__init__.py", line 17, in <module>
    from .tts import Mimic3Settings, Mimic3TextToSpeechSystem
  File "/home/root/mimic3/mimic3_tts/tts.py", line 53, in <module>
    from .voice import SPEAKER_TYPE, BreakType, Mimic3Voice
  File "/home/root/mimic3/mimic3_tts/voice.py", line 29, in <module>
    import gruut
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/gruut/__init__.py", line 13, in <module>
    from gruut.text_processor import Sentence, TextProcessor
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/gruut/text_processor.py", line 13, in <module>
    import dateparser
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/dateparser/__init__.py", line 3, in <module>
    from .date import DateDataParser
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/dateparser/date.py", line 7, in <module>
    from tzlocal import get_localzone
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/tzlocal/__init__.py", line 10, in <module>
    from tzlocal.unix import get_localzone, get_localzone_name, reload_localzone
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/tzlocal/unix.py", line 6, in <module>
    import pytz_deprecation_shim as pds
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/pytz_deprecation_shim/__init__.py", line 16, in <module>
    from . import helpers
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/pytz_deprecation_shim/helpers.py", line 5, in <module>
    from . import _common, _compat
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/pytz_deprecation_shim/_compat.py", line 6, in <module>
    from . import _compat_py3 as _compat_impl
  File "/home/root/mimic3/.venv/lib/python3.10/site-packages/pytz_deprecation_shim/_compat_py3.py", line 7, in <module>
    from backports import zoneinfo
ModuleNotFoundError: No module named 'backports'
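The traceback shows pytz_deprecation_shim first trying import zoneinfo and then falling back to from backports import zoneinfo, with both imports failing. Since zoneinfo has been part of the Python standard library from version 3.9, its absence under Python 3.10 suggests that this image's Python was packaged without that module; that is an assumption, not something the report confirms. A minimal sketch for checking which of the two modules is importable (module_available is an illustrative helper name):

```python
import importlib.util


def module_available(name):
    """Return True if the named module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (such as 'backports') is missing.
        return False


report = {
    name: module_available(name)
    for name in ("zoneinfo", "backports.zoneinfo")
}
```

If both entries in report are False inside the virtual environment, the import chain shown above cannot succeed until one of them is provided.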

[CONTRIBUTION] Speech Dataset Generator

Hi everyone!

My name is David Martin Rius and I have just published this project on GitHub: https://github.com/davidmartinrius/speech-dataset-generator/

Now you can create datasets automatically with any audio or lists of audios.

I hope you find it useful.

Here are the key functionalities of the project:

Dataset Generation: Creation of multilingual datasets with Mean Opinion Score (MOS).
Silence Removal: It includes a feature to remove silences from audio files, enhancing the overall quality.
Sound Quality Improvement: It improves the quality of the audio when needed.
Audio Segmentation: It can segment audio files within specified second ranges.
Transcription: The project transcribes the segmented audio, providing a textual representation.
Gender Identification: It identifies the gender of each speaker in the audio.
Pyannote Embeddings: Utilizes pyannote embeddings for speaker detection across multiple audio files.
Automatic Speaker Naming: Automatically assigns names to speakers detected in multiple audios.
Multiple Speaker Detection: Capable of detecting multiple speakers within each audio file.
Store speaker embeddings: The speakers are detected and stored in a Chroma database, so you do not need to assign a speaker name.
Syllabic and words-per-minute metrics

Feel free to explore the project at https://github.com/davidmartinrius/speech-dataset-generator

David Martin Rius

certificate verify failed: unable to get local issuer certificate

The example with mimic3 'Hello world.' | aplay works just fine. However, when I start mimic3-server, I am greeted by some errors:

INFO:mimic3_http.__main__:Starting web server
ERROR:mimic3_http.app:Error setting up swagger UI page
Traceback (most recent call last):
  File "mimic3_http/app.py", line 338, in get_app
  File "swagger_ui/__init__.py", line 13, in api_doc
Exception: No match application isinstance type!
[2023-08-25 22:58:35 +0300] [70595] [INFO] Running on http://0.0.0.0:59125 (CTRL + C to quit)
INFO:hypercorn.error:Running on http://0.0.0.0:59125 (CTRL + C to quit)

Then, after opening the web-interface, if I select any language other than EN-UK, the web interface displays this error: URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>. Below I give the more detailed output from the console:

ERROR:mimic3_http.synthesis:Error during inference
Traceback (most recent call last):
  File "urllib/request.py", line 1346, in do_open
  File "http/client.py", line 1255, in request
  File "http/client.py", line 1301, in _send_request
  File "http/client.py", line 1250, in endheaders
  File "http/client.py", line 1010, in _send_output
  File "http/client.py", line 950, in send
  File "http/client.py", line 1424, in connect
  File "ssl.py", line 500, in wrap_socket
  File "ssl.py", line 1040, in _create
  File "ssl.py", line 1309, in do_handshake
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mimic3_http/synthesis.py", line 125, in do_synthesis_proc
  File "mimic3_http/synthesis.py", line 81, in do_synthesis
  File "mimic3_http/synthesis.py", line 61, in do_synthesis
  File "mimic3_tts/tts.py", line 368, in speak_text
  File "mimic3_tts/tts.py", line 569, in _get_or_load_voice
  File "mimic3_tts/tts.py", line 621, in _download_voice
  File "mimic3_tts/download.py", line 121, in download_voice
  File "urllib/request.py", line 214, in urlopen
  File "urllib/request.py", line 517, in open
  File "urllib/request.py", line 534, in _open
  File "urllib/request.py", line 494, in _call_chain
  File "urllib/request.py", line 1389, in https_open
  File "urllib/request.py", line 1349, in do_open
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>
ERROR:mimic3_http.app:<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>
Traceback (most recent call last):
  File "urllib/request.py", line 1346, in do_open
  File "http/client.py", line 1255, in request
  File "http/client.py", line 1301, in _send_request
  File "http/client.py", line 1250, in endheaders
  File "http/client.py", line 1010, in _send_output
  File "http/client.py", line 950, in send
  File "http/client.py", line 1424, in connect
  File "ssl.py", line 500, in wrap_socket
  File "ssl.py", line 1040, in _create
  File "ssl.py", line 1309, in do_handshake
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "quart/app.py", line 1673, in full_dispatch_request
  File "quart/app.py", line 1718, in dispatch_request
  File "mimic3_http/app.py", line 215, in app_tts
  File "mimic3_http/app.py", line 103, in text_to_wav
  File "mimic3_http/synthesis.py", line 125, in do_synthesis_proc
  File "mimic3_http/synthesis.py", line 81, in do_synthesis
  File "mimic3_http/synthesis.py", line 61, in do_synthesis
  File "mimic3_tts/tts.py", line 368, in speak_text
  File "mimic3_tts/tts.py", line 569, in _get_or_load_voice
  File "mimic3_tts/tts.py", line 621, in _download_voice
  File "mimic3_tts/download.py", line 121, in download_voice
  File "urllib/request.py", line 214, in urlopen
  File "urllib/request.py", line 517, in open
  File "urllib/request.py", line 534, in _open
  File "urllib/request.py", line 494, in _call_chain
  File "urllib/request.py", line 1389, in https_open
  File "urllib/request.py", line 1349, in do_open
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>

Steps to reproduce the behavior:

Start mimic3-server and open http://localhost:59125
Click on 'speak' without changing the default language or default text
See error URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)>

Expected behavior
The text in the input field should be spoken in the selected (i.e. default) language.

Environment (please complete the following information):

Device type: desktop
OS: Manjaro Linux
Mycroft-core version: not installed
Other versions: Python 3.11.3-2

Additional context
No modifications to source code or config files.

If I change in the web-interface the language to English (UK), it is spoken as it is supposed to be.
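The error means that Python's ssl module could not find a certificate authority it trusts for the host serving the voice download. A first diagnostic step, sketched here with the standard library only, is to print where this Python build looks for CA certificates and whether that file exists. The helper name describe_ca_paths is illustrative.

```python
import os
import ssl


def describe_ca_paths():
    """Collect the CA certificate locations this Python build uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # resolved file, or None
        "capath": paths.capath,                  # resolved directory, or None
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default file
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory
        "cafile_env": paths.openssl_cafile_env,  # name of the override variable
        "cafile_env_value": os.environ.get(paths.openssl_cafile_env),
        "openssl_cafile_exists": os.path.exists(paths.openssl_cafile),
    }
```

If the compiled-in file does not exist on the system, pointing the environment variable named in cafile_env at the distribution's CA bundle before starting mimic3-server may help; whether that resolves this particular report is not confirmed.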

mimic3 repeats every utterance twice

A couple of us humble users found a bug in the spelling skill and we're wondering if it is actually a bug in mimic 3. Would you mind taking a look?

MycroftAI/skill-spelling#30

Could not find a version that satisfies the requirement mimic3-tts[de]

When I tried to install the mycroft-plugin-tts-mimic3[all] Python library, I got this error:

"#16 85.10 ERROR: Could not find a version that satisfies the requirement mimic3-tts[de] (from mycroft-plugin-tts-mimic3[all]) (from versions: none)",
"#16 85.11 ERROR: No matching distribution found for mimic3-tts[de]",

Strange soundings when using german voices installed via pip

I installed it as follows:
$ pip install mycroft-mimic3-tts
After that, this works:
$ mimic3 -v="en_US" hello
But if I try to use a German voice:
$ mimic3 -v="de_DE" "Hallo"
the following output is given:

Reading text from stdin...
INFO:mimic3_tts.tts:Loaded voice from /home/joshua/.local/share/mycroft/mimic3/voices/de_DE/thorsten_low
Traceback (most recent call last):
  File "/home/joshua/.local/bin/mimic3", line 8, in <module>
    sys.exit(main())
  File "/home/joshua/.local/lib/python3.8/site-packages/mimic3_tts/__main__.py", line 129, in main
    process_lines(state)
  File "/home/joshua/.local/lib/python3.8/site-packages/mimic3_tts/__main__.py", line 450, in process_lines
    process_line(line, state, line_id=line_id, line_voice=line_voice)
  File "/home/joshua/.local/lib/python3.8/site-packages/mimic3_tts/__main__.py", line 379, in process_line
    state.tts.speak_text(line)
  File "/home/joshua/.local/lib/python3.8/site-packages/mimic3_tts/tts.py", line 380, in speak_text
    for sent_phonemes, break_type in voice.text_to_phonemes(
  File "/home/joshua/.local/lib/python3.8/site-packages/mimic3_tts/voice.py", line 420, in text_to_phonemes
    for sentence in gruut.sentences(text, lang=text_language):
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/__init__.py", line 79, in sentences
    graph, root = text_processor(text, lang=lang, ssml=ssml, **process_args)
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/text_processor.py", line 441, in __call__
    return self.process(*args, **kwargs)
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/text_processor.py", line 690, in process
    self._pipeline_tokenize(
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/text_processor.py", line 1618, in _pipeline_tokenize
    in_lexicon = self._is_word_in_lexicon(word_text_norm, settings)
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/text_processor.py", line 2097, in _is_word_in_lexicon
    return bool(settings.lookup_phonemes(word, do_transforms=False))
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/lang.py", line 900, in __call__
    return self.phonemizer(word, role=role, do_transforms=do_transforms)
  File "/home/joshua/.local/lib/python3.8/site-packages/gruut/phonemize.py", line 91, in __call__
    cursor = self.db_conn.execute(
sqlite3.OperationalError: no such column: role
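The traceback ends in gruut's phonemizer running a SQLite query that refers to a role column the lexicon database does not have. That would be consistent with the gruut code and the installed German lexicon coming from different versions, although the report does not confirm it. The sketch below reproduces the same SQLite error on an in-memory database and shows how to list a table's columns; the table name and schema are hypothetical and are not gruut's actual layout.

```python
import sqlite3


def column_names(conn, table):
    """List column names via PRAGMA table_info (table name must be trusted)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
# Hypothetical older schema without a 'role' column.
conn.execute("CREATE TABLE word_phonemes (word TEXT, phonemes TEXT)")

try:
    conn.execute(
        "SELECT phonemes FROM word_phonemes WHERE word = ? AND role = ?",
        ("hallo", ""),
    )
    error_message = None
except sqlite3.OperationalError as error:
    error_message = str(error)  # "no such column: role"
```

Running column_names against the real lexicon file would show whether the column the code expects is present.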

Streaming audio output

Is your feature request related to a problem? Please describe.
I'm always frustrated when I try to convert a long text to speech and I have to either use interactive mode (which, in order to play on any system other than the mimic3 server computer, leaves me with a bunch of files) or just wait until the entire thing is converted (like when using curl to post the text to the API and listening to the resulting download).

Describe the solution you'd like
When using curl to download audio from the api, this audio should start immediately instead of only starting when the entire text is completed.

Describe alternatives you've considered
I tried using the mimic3 command directly, but it seems to do the same--it works until it has finished the whole text, and then outputs the entire audio. If you use --interactive mode it outputs audio after every sentence, which might be fine if I were using the server computer mimic3 is running on to actually listen to the audio, but I would like to be able to listen to the audio on a different machine (e.g., my phone), in close to real time.

Additional context
If the speech audio immediately started coming from the api when sending it a post request with the text to convert to speech, curl could be used to stream the audio and pipe it to some other app to play it in real time. This would be much easier than trying to deal with all the files created by --interactive mode, and as far as I can tell interactive mode isn't available anyway with the api (although since the api documentation page is broken it's hard to know for sure).
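Until the server streams audio itself, a client can approximate the behaviour by sending one sentence per request, so that playback of the first sentence can start while later ones are still being synthesized. This is a client-side workaround sketched under assumptions: it uses the /api/tts endpoint from the Quickstart, a naive punctuation-based sentence splitter rather than the project's own text processing, and illustrative function names.

```python
import re
import urllib.request


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def iter_audio_chunks(text, base_url="http://localhost:59125"):
    """Yield one audio response per sentence as soon as each is ready."""
    for sentence in split_sentences(text):
        request = urllib.request.Request(
            f"{base_url}/api/tts",
            data=sentence.encode("utf-8"),
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            yield response.read()
```

Each yielded chunk is a complete response for a single sentence, so a player on the receiving machine would need to play them back to back.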

Thanks so much! The voices sound amazing.

Does not install on 32bit raspberry pi, onnxruntime cannot be installed.

Describe the bug

mycroft-pip install mycroft-plugin-tts-mimic3[all] fails to install on a raspberry pi 3 or 4 running 32 bit raspbian.

ERROR: Could not find a version that satisfies the requirement onnxruntime
ERROR: No matching distribution found for onnxruntime

It seems PyPI has no pre-built onnxruntime for 32-bit ARM. Browsing around the ONNX site, it seems this platform is not supported.

To Reproduce
Steps to reproduce the behavior:

Install 32bit raspbian
mycroft-pip install mycroft-plugin-tts-mimic3[all]

Expected behavior
The documentation says it should work on 32bit armv7l.

Environment (please complete the following information):

Device type: RPI 3 B and RPI 4
OS: raspbian bullseye
Mycroft-core version: latest master
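pip selects pre-built wheels by platform, so it helps to know exactly what the interpreter reports before installing. A minimal sketch that prints the machine architecture and the pointer width of the running Python; on a 32-bit Raspbian install this would be expected to show a 32-bit interpreter on an armv7l-style machine, the combination this report says has no onnxruntime wheel.

```python
import platform
import struct


def python_bits():
    """Pointer width of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def describe_platform():
    """Summarize the machine architecture and interpreter width."""
    return f"{platform.machine()} / {python_bits()}-bit Python"


summary = describe_platform()
```

A 64-bit kernel with a 32-bit userland still reports a 32-bit Python here, which is the value that matters for wheel selection.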

Returning the wrong slashes for url

mimic3/mimic3_tts/tts.py

Line 565 in 15b83ea

maybe_model_dir = Path(maybe_voice.location)

Not sure why (it might be a version issue), but Path is returning, for example, http:\github.com
Is there a switch or config that makes this happen?
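This behaviour follows from how pathlib works rather than from a configuration switch: Path treats its argument as a filesystem path, collapses repeated separators, and on Windows renders separators as backslashes. The sketch below demonstrates this with the pure path classes, which behave the same on any operating system, and shows one possible guard that keeps URLs as strings. The helper names are illustrative and this is not presented as the project's fix.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath
from urllib.parse import urlsplit


def is_http_url(location):
    """True if location looks like an http(s) URL rather than a local path."""
    return urlsplit(str(location)).scheme in ("http", "https")


def resolve_location(location):
    """Keep URLs as strings; convert only local locations to Path."""
    return str(location) if is_http_url(location) else Path(location)


# The same URL as seen by Windows-style and POSIX-style path parsing.
windows_view = str(PureWindowsPath("http://github.com"))  # http:\github.com
posix_view = str(PurePosixPath("http://github.com"))      # http:/github.com
```

Restricting the check to the http and https schemes avoids misreading a Windows drive letter such as C: as a URL scheme.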

Install not working with python 3.11.3

Describe the bug
using python 3.11.3 version make install can not fulfill the

To Reproduce
install python 3.11.3
make install

Expected behavior
install should work

Log files

./install.sh
Creating virtual environment at /../mimic3/.venv (Python 3.11.3)
Installing Python dependencies
Looking in links: /../mimic3/wheels, https://synesthesiam.github.io/prebuilt-apps/
Requirement already satisfied: pip in ./.venv/lib/python3.11/site-packages (22.3.1)
Collecting pip
  Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.3.1
    Uninstalling pip-22.3.1:
      Successfully uninstalled pip-22.3.1
Successfully installed pip-23.1.2
Looking in links: /home/cseeger/Others/git/AI/mimic3/wheels, https://synesthesiam.github.io/prebuilt-apps/
Collecting wheel
  Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
Requirement already satisfied: setuptools in ./.venv/lib/python3.11/site-packages (65.5.0)
Collecting setuptools
  Using cached setuptools-67.7.2-py3-none-any.whl (1.1 MB)
Installing collected packages: wheel, setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 65.5.0
    Uninstalling setuptools-65.5.0:
      Successfully uninstalled setuptools-65.5.0
Successfully installed setuptools-67.7.2 wheel-0.40.0
~/Others/git/AI/mimic3 ~/Others/git/AI/mimic3
Looking in links: /home/cseeger/Others/git/AI/mimic3/wheels, https://synesthesiam.github.io/prebuilt-apps/
Obtaining file:///home/cseeger/Others/git/AI/mimic3
  Preparing metadata (setup.py) ... done
Collecting dataclasses-json<1.0 (from mycroft-mimic3-tts==0.2.5)
  Using cached dataclasses_json-0.5.7-py3-none-any.whl (25 kB)
Collecting epitran==1.17 (from mycroft-mimic3-tts==0.2.5)
  Using cached epitran-1.17-py2.py3-none-any.whl (153 kB)
Collecting espeak-phonemizer<2.0,>=1.0 (from mycroft-mimic3-tts==0.2.5)
  Using cached espeak_phonemizer-1.3.0.tar.gz (18 kB)
  Preparing metadata (setup.py) ... done
Collecting gruut<3.0,>=2.3.0 (from mycroft-mimic3-tts==0.2.5)
  Using cached gruut-2.3.4.tar.gz (74 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy<2.0 (from mycroft-mimic3-tts==0.2.5)
  Using cached numpy-1.24.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
INFO: pip is looking at multiple versions of mycroft-mimic3-tts[all] to determine which version is compatible with other requirements. This could take a while.
ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5 Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11
ERROR: Could not find a version that satisfies the requirement onnxruntime<2.0,>=1.6 (from mycroft-mimic3-tts[all]) (from versions: none)
ERROR: No matching distribution found for onnxruntime<2.0,>=1.6
make: *** [Makefile:32: install] Error 1

Environment (please complete the following information):

Device type: desktop
OS: archlinux
Mycroft-core version: current
Other versions: python 3.11.3

Home Assistant compatibility

Hey there

Attempting to add Mimic 3 (docker container) as a drop-in replacement for MaryTTS is not working:

I have added the following to the configuration.yaml:

tts:
  - platform: marytts
    host: "localhost"
    port: 59125
    language: "en_UK"
    voice: apope_low

When checking the configuration's validity, the following error is thrown:

If I swap en_UK to en_GB, this check passes:

So, is this a Home Assistant or Mimic 3 issue? Can I manually rename the en_UK voice to en_GB?

[SSML] Wrong settings used for speech synthesis

Describe the bug
Mimic3 ignores the settings of the current <prosody> block and instead applies those of the last closed <prosody> block.

To Reproduce
mimic3 '<prosody rate="200%">This should be spoken fast but is not.</prosody><prosody volume="30%">This should be a bit quieter but is actually spoken faster</prosody>' --ssml | aplay

Expected behavior
Mimic3 should speak the first sentence fast and the second one with lowered volume.

Environment
- Device type: desktop
- OS: Ubuntu 22.04

Source of actual behavior

mimic3/mimic3_tts/tts.py

Lines 470 to 501 in be72c18

def end_utterance(self) -> typing.Iterable[BaseResult]:
    last_settings: typing.Optional[Mimic3Settings] = None

    sent_phonemes: PHONEMES_LIST_TYPE = []

    for result in self._results:
        if isinstance(result, Mimic3Phonemes):
            if result.is_utterance:
                # Utterance boundary
                if (
                    sent_phonemes
                    and (last_settings is not None)
                    and (result.current_settings != last_settings)
                ):
                    # Not compatible with existing utterance.
                    # Need to speak previous utterance first.
                    yield self._speak_sentence_phonemes(
                        sent_phonemes, settings=last_settings
                    )
                    sent_phonemes.clear()

                # Current utterance
                sent_phonemes.extend(result.phonemes)

                if sent_phonemes:
                    yield self._speak_sentence_phonemes(
                        sent_phonemes, settings=last_settings
                    )
                    sent_phonemes.clear()
            else:
                # Continue until utterance boundary
                sent_phonemes.extend(result.phonemes)

            last_settings = result.current_settings
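
In the quoted method, the current utterance is spoken with last_settings, which still holds the settings of the previous result. A simplified, self-contained model of the grouping logic, in which every group is emitted with its own settings, could look like the sketch below. `Settings`, `Chunk` and `group_by_settings` are hypothetical stand-ins for illustration, not the project's classes or a tested patch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for Mimic3Settings."""
    rate: float = 1.0
    volume: float = 1.0


@dataclass
class Chunk:
    """Hypothetical stand-in for Mimic3Phonemes."""
    phonemes: List[str]
    settings: Settings
    is_utterance: bool = False


def group_by_settings(
    chunks: Iterable[Chunk],
) -> Iterator[Tuple[List[str], Optional[Settings]]]:
    """Yield (phonemes, settings) groups, each with the settings it was created under."""
    pending: List[str] = []
    pending_settings: Optional[Settings] = None
    for chunk in chunks:
        if pending and chunk.settings != pending_settings:
            # Settings changed: flush what was collected under the old settings.
            yield list(pending), pending_settings
            pending.clear()
        pending.extend(chunk.phonemes)
        pending_settings = chunk.settings  # update before speaking, not after
        if chunk.is_utterance and pending:
            yield list(pending), pending_settings
            pending.clear()
    if pending:
        yield list(pending), pending_settings
```

The difference from the quoted code is only the order of operations: the settings are recorded before the utterance is emitted.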

long pause

Is your feature request related to a problem? Please describe.
Is it possible to add a marker in the text file to create a long pause of 10 seconds or more?

Describe the solution you'd like
To be able to put [[10 second pause]] in the text file, and have mimic3 use this marker to pause for 10 seconds.

Describe alternatives you've considered
use the number of blank lines to set the pause length

Additional context
The idea is to generate multiple-choice questions from text and give the answer after a pause/break.
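
One possible workaround is to preprocess the text into SSML before handing it to mimic3 with --ssml. The sketch below assumes the marker syntax proposed above and the SSML <break> element; whether a break this long is honored is not confirmed here. `markers_to_ssml` is a hypothetical helper.

```python
import re
from xml.sax.saxutils import escape

# Matches the proposed marker, e.g. [[10 second pause]]
PAUSE_MARKER = re.compile(r"\[\[(\d+) second pause\]\]")


def markers_to_ssml(text: str) -> str:
    """Escape the text for XML and turn pause markers into <break> elements."""
    escaped = escape(text)  # brackets are left untouched by XML escaping
    body = PAUSE_MARKER.sub(
        lambda match: '<break time="{}s"/>'.format(match.group(1)), escaped
    )
    return "<speak>{}</speak>".format(body)


if __name__ == "__main__":
    print(markers_to_ssml("What is 2 + 2? [[10 second pause]] Four."))
```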

TTS: last letter of the text isn't spoken in Spanish

Hello,

I installed a fresh Mimic3 from docker.

I tried some languages: English, German, Spanish, French, etc.

All languages work quite well, except Spanish.

There is a bug with the Spanish speakers. When I enter a text, the last letter isn't pronounced completely. For example, if I write "Buenos días", the TTS says "Buenos di". Another example: "estoy en la playa" is pronounced "estoy en la pla".

Do you know how to fix that?

Thank you!

David

SSML doesn't work as expected

Describe the bug
SSML doesn't work as expected

To Reproduce
Steps to reproduce the behavior:

Click 'Enabled SSML'
Paste the following:

<speak>
<prosody rate="10%">
The weather today is rainy and clouds.
With a current temperature of 7 degrees, with it feeling like 7 degrees.
The highest temperature will be 9 and a lowest of 5.
Have a good day.
</prosody>
</speak>

<speak>
<prosody rate="10%">
<break time="1000ms"/>
The weather today is rainy and clouds.
With a current temperature of 7 degrees, with it feeling like 7 degrees.
The highest temperature will be 9 and a lowest of 5.
Have a good day.
</prosody>
</speak>

Expected behavior
Everything should be at 10% speed.

Log files
There are no logs

Environment (please complete the following information):

Device type: Kubernetes
OS: Docker
Mycroft-core version: mycroftai/mimic3:0.2.4

Additional context
Seems to happen with any voice. It's like the prosody parameter is ignored for the first sentence. If I tell it to use a different voice from the beginning, the first sentence will be in the default voice before it switches. It does this when running on Home Assistant, or when running in Docker on my main computer.

Allow unescaped `#` in web API for choosing voice speaker

Using the web server POST API, I found voice key selection a little confusing. I could enter a speakerless key like en_US/vctk_low without escaping the /, which made me think the # in en_US/vctk_low#p238 wouldn't need to be URI encoded as %23 either. I expected the API to parse the fragment and convert it to a speaker. Not so, it simply ignores the fragment, which puzzled me for some time figuring out how to set a speaker.

I would like the API to parse any fragments and append them to the last query with a hash sign. This would make it consistent with not needing to escape the /.

The current workaround is simply to use %23 as an escaped replacement for a hash in POST requests. It makes sense for a web API to require this, but it made my user experience worse due to confusion with the /.
Another aid could be to give an example in the documentation where the curl query has a speaker set, which would include the escaped #.
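
A small sketch of the workaround in Python: percent-encode the voice key so the # reaches the server as part of the query instead of being treated as a URL fragment, which HTTP clients generally do not transmit. The query parameter name `voice` and the helper `tts_url` are assumptions for illustration.

```python
from urllib.parse import quote, urlsplit


def tts_url(voice_key: str, base: str = "http://localhost:59125/api/tts") -> str:
    """Build a TTS URL, escaping '#' (as %23) but leaving '/' readable."""
    return "{}?voice={}".format(base, quote(voice_key, safe="/"))


if __name__ == "__main__":
    print(tts_url("en_US/vctk_low#p238"))
    # Without escaping, the speaker ends up in the fragment, not the query:
    raw = urlsplit("http://localhost:59125/api/tts?voice=en_US/vctk_low#p238")
    print(raw.query, "|", raw.fragment)
```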

espeak-phonemizer package dependency causes pip install to fail

C:\Users\user\Projects\tts> pip install mycroft-mimic3-tts[all]

Collecting mycroft-mimic3-tts[all]
Using cached mycroft_mimic3_tts-0.2.4.tar.gz (131 kB)
Preparing metadata (setup.py) ... done
Collecting dataclasses-json<1.0
Using cached dataclasses_json-0.5.7-py3-none-any.whl (25 kB)
Collecting epitran==1.17
Using cached epitran-1.17-py2.py3-none-any.whl (153 kB)
Collecting espeak-phonemizer<2.0,>=1.0
Using cached espeak_phonemizer-1.1.0.tar.gz (18 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [10 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "C:\Users\user\AppData\Local\Temp\pip-install-so_mcy43\espeak-phonemizer_62f9d420053a418097eaf7bbfc7bd665\setup.py", line 16, in
long_description = readme_path.read_text()
File "C:\Python310\lib\pathlib.py", line 1135, in read_text
return f.read()
File "C:\Python310\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 707: character maps to <undefined>
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details
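
The traceback shows readme_path.read_text() being decoded with cp1252, the locale default on this Windows machine. A minimal sketch of the failure and the usual remedy (passing an explicit encoding) follows; the sample text and the helper `read_readme` are made up for illustration and are not the package's actual setup.py.

```python
import tempfile
from pathlib import Path

# U+0250 is encoded in UTF-8 as the bytes C9 90; 0x90 is undefined in cp1252.
SAMPLE = "IPA sample: \u0250\n"


def read_readme(path: Path) -> str:
    """Read a UTF-8 file regardless of the platform's default encoding."""
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        readme = Path(tmp) / "README.md"
        readme.write_bytes(SAMPLE.encode("utf-8"))
        try:
            # What a cp1252 default locale does implicitly:
            readme.read_bytes().decode("cp1252")
        except UnicodeDecodeError as err:
            print("cp1252 failed:", err.reason)
        print(read_readme(readme) == SAMPLE)
```

Setting the environment variable PYTHONUTF8=1 before running pip is another way to make UTF-8 the default on Python 3.7 and later.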

Python install and source install fail at: satisfies the requirement onnxruntime<2.0,>=1.6

Installation fails when installing onnxruntime.
This happens both on the python install and the source install.
Using an up to date Arch Linux install on an AMD64 desktop computer.

ERROR: Could not find a version that satisfies the requirement onnxruntime<2.0,>=1.6 (from mycroft-mimic3-tts[all]) (from versions: none)
ERROR: No matching distribution found for onnxruntime<2.0,>=1.6

(I have tried to manually install onnxruntime without success, I get the same versioning issue).

Only word choice affects whether a question is spoken like a question.

Describe the bug
I had initially tried to have it say simple things like, "Hello. Hello? I did that. I did? No? No. No?!" and so on, to try to hear variation in how text was spoken; like, a ramp toward a slightly higher pitch toward the end of a sentence, that sort of thing. However, I heard practically no variation in the default en_UK/apope_low voice no matter what I tried, so assumed it just couldn't make different sounds for the same words.

However, after discovering that the non-default voices would always speak slightly differently every time I'd have them say the same thing, I started doing more testing.. And found that even with --noise-scale 0 --noise-w 0 I could get these voices to have things like that ramped up pitch at the end if I unambiguously worded a sentence like a question to begin with.

This seems most consistent with the en_US/ljspeech_low voice. The others often do sound like they're saying a question, but it's ambiguous. This.. Works, but not well.

To Reproduce
Compare the output audio for the following commands:

mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'Where was it?'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'Where was it.'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'That was it?'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'That was it.'

Expected behavior
The 'was it' at the end of commands 1 and 3 above should be spoken as if they were questions. The 'was it' at the end of commands 2 and 4 should be spoken as if they were statements.

Instead, it speaks 1 and 2 completely identically, as if both of them are questions. It speaks 3 and 4 identically as well, but as if they are both statements.

Environment (please complete the following information):

Device type: Desktop
OS: KDE Neon (based on Ubuntu 22.04)
Mycroft-core version: I only have mycroft-mimic3-tts, version 0.2.4.

[SSML] Line break and white space causing artefacts before </s> tag

Describe the bug
When Mimic 3 plays an SSML file, a line break and white space at the end of the spoken text, before the closing tag, cause artefacts and noise in the speech. I first noticed this when I tested the SSML sample file found on the page https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3.

To Reproduce
mimic3_ssml_cracks_and_noise.zip
Steps to reproduce the behavior:

Unzip attached file.
Play SSML file interactively: mimic3 --ssml --interactive < mimic3_ssml_cracks_and_noise.ssml
Alternatively play attached mimic3_ssml_cracks_and_noise.wav file.
Note the noise and artefacts, and compare what you hear to how it is presented in the SSML file.

Expected behavior
There should not be any noise artefacts after spoken text in any of the cases.

Log files
n/a

Environment (please complete the following information):

Desktop PC (AMD Ryzen Threadripper 2950X CPU), laptop (Intel i7-10510U CPU)
OS: Opensuse Tumbleweed
Mycroft-core version

$ zypper info mycroft-core 
Information for package mycroft-core:
-------------------------------------
Repository     : Main Repository (OSS) (20230202)
Name           : mycroft-core
Version        : 18.8.13-1.19
Arch           : noarch
Vendor         : openSUSE
Installed Size : 14.5 MiB
Installed      : Yes
Status         : up-to-date
Source package : mycroft-core-18.8.13-1.19.src
Upstream URL   : https://mycroft.ai
Summary        : The Mycroft Artificial Intelligence platform
Description    : 
    Mycroft is a voice assistant.

Python3 is python 3.10.9

I have not touched the config files at all on either machine.

🗣💬 SSML / unexpected vocalizations

BUG
When using SSML, at the end of each sentence you can hear some unexpected and unwanted vocalizations.

AUDIO EXAMPLE
https://drive.google.com/file/d/1iLbnOksA4Avtq021Wf6Z50m1yUNPHcIx/view

mimic3-server: OpenAPI page does not work (Failed to load API definition)

Describe the bug
The OpenAPI page fails to load with the message "Failed to load API definition". It also displays the following error:

Fetch error
response status is 500 /openapi/swagger.json

Additionally, the following error appears in the mimic3-server log output:

ERROR:mimic3_http.app:
Traceback (most recent call last):
  File "/var/lib/mimic3/venv/lib/python3.9/site-packages/quart/app.py", line 1512, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/var/lib/mimic3/venv/lib/python3.9/site-packages/quart/app.py", line 1557, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/var/lib/mimic3/venv/lib/python3.9/site-packages/swagger_ui/handlers/quart.py", line 27, in swagger_blueprint_config_handler
    return jsonify(doc.get_config(request.host))
  File "/var/lib/mimic3/venv/lib/python3.9/site-packages/swagger_ui/core.py", line 130, in get_config
    assert Path(self.config_path).is_file()
AssertionError

This is most likely unrelated to #5, as this issue occurs when accessing the app directly (with default settings, not through a reverse proxy or similar).

To reproduce
Steps to reproduce the behavior:

Start a mimic3-server instance with the default settings.
Navigate to http://localhost:59125/openapi/.
The error appears and the page fails to load.

Environment:

Device type: x86_64, aarch64
OS: Arch Linux, Arch Linux ARM
httpd 2.4.54
Python 3.9

AttributeError: 'Num2Word_IT' object has no attribute 'to_year'

Describe the bug
Mimic3 works on Docker with Italian voices, but it seems to crash on years.

To Reproduce
Steps to reproduce the behavior:

type: "Fino a quando, dopo quattro anni di relazione, dal 2016 al 2020, e un anno di convivenza, la donna aveva scoperto per caso la malattia del compagno."
pick Italian

Expected behavior
it should read

Log files

ERROR:mimic3_http.app:'Num2Word_IT' object has no attribute 'to_year'
Traceback (most recent call last):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1673, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1718, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/home/mimic3/app/mimic3_http/app.py", line 215, in app_tts
    wav_bytes = await text_to_wav(
  File "/home/mimic3/app/mimic3_http/app.py", line 103, in text_to_wav
    wav_bytes = await future
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
    result = do_synthesis(item, mimic3)
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
    raise e
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
    mimic3.speak_text(params.text, text_language=params.text_language)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 380, in speak_text
    for sent_phonemes, break_type in voice.text_to_phonemes(
  File "/home/mimic3/app/mimic3_tts/voice.py", line 420, in text_to_phonemes
    for sentence in gruut.sentences(text, lang=text_language):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/gruut/__init__.py", line 79, in sentences
    graph, root = text_processor(text, lang=lang, ssml=ssml, **process_args)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/gruut/text_processor.py", line 441, in __call__
    return self.process(*args, **kwargs)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/gruut/text_processor.py", line 1079, in process
    if pipeline_transform(self._verbalize_number, graph, root):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/gruut/utils.py", line 351, in pipeline_transform
    if transform_func(graph, leaf_node):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/gruut/text_processor.py", line 2146, in _verbalize_number
    num_str = num2words(final_num, **num2words_kwargs)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/num2words/__init__.py", line 96, in num2words
    return getattr(converter, "to_{}".format(to))(number, **kwargs)
AttributeError: 'Num2Word_IT' object has no attribute 'to_year'
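
The last frame shows the conversion method being looked up by name with getattr, which raises when a language class does not define it. A defensive sketch that falls back to cardinal numbers is shown below. `CardinalOnlyConverter` and `verbalize` are hypothetical stand-ins; this is not how gruut or num2words resolve the problem.

```python
class CardinalOnlyConverter:
    """Hypothetical stand-in for a language converter that lacks to_year."""

    def to_cardinal(self, number: int) -> str:
        return "cardinal({})".format(number)


def verbalize(converter, number: int, to: str = "year",
              fallback: str = "cardinal") -> str:
    """Call converter.to_<to>(number), or the fallback method if it is missing."""
    method = getattr(converter, "to_{}".format(to), None)
    if method is None:
        # Still raises AttributeError if the fallback is missing too.
        method = getattr(converter, "to_{}".format(fallback))
    return method(number)


if __name__ == "__main__":
    print(verbalize(CardinalOnlyConverter(), 2016))
```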

Dependency on xdgenvpy which has disappeared upstream

The xdgenvpy package on PyPI links to a GitLab repo that doesn't exist any more. I can't find an alternative repo for it anywhere on the internet either, so I'm assuming it's not maintained any more. Furthermore, a source install using the tarball on PyPI is not possible out of the box, because setup.py references a requirements.txt that is not included in the package.

It would be great if this package can be moved away from.

Getting 404 errors

Describe the bug
It errors when clicking speak.

To Reproduce
Steps to reproduce the behavior:

Install with Docker or Kubernetes.
Click "Speak"

Expected behavior
It should speak

Log files

Traceback (most recent call last):
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
    result = do_synthesis(item, mimic3)
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
    raise e
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
    mimic3.speak_text(params.text, text_language=params.text_language)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 368, in speak_text
    voice = self._get_or_load_voice(self.voice)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 569, in _get_or_load_voice
    maybe_model_dir = self._download_voice(voice_key)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 621, in _download_voice
    download_voice(
  File "/home/mimic3/app/mimic3_tts/download.py", line 140, in download_voice
    raise VoiceDownloadError(
mimic3_tts.download.VoiceDownloadError: Failed to download file for voice af_ZA/google-nwu_low from https://github.com/MycroftAI/mimic3-voices/raw/master/voices/af_ZA/google-nwu_low/generator.onnx: HTTP Error 404: Not Found
ERROR:mimic3_http.app:Failed to download file for voice af_ZA/google-nwu_low from https://github.com/MycroftAI/mimic3-voices/raw/master/voices/af_ZA/google-nwu_low/generator.onnx: HTTP Error 404: Not Found
Traceback (most recent call last):
  File "/home/mimic3/app/mimic3_tts/download.py", line 121, in download_voice
    with urllib.request.urlopen(file_url) as response:
  File "/usr/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 555, in error
    result = self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 747, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1673, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1718, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/home/mimic3/app/mimic3_http/app.py", line 215, in app_tts
    wav_bytes = await text_to_wav(
  File "/home/mimic3/app/mimic3_http/app.py", line 103, in text_to_wav
    wav_bytes = await future
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
    result = do_synthesis(item, mimic3)
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
    raise e
  File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
    mimic3.speak_text(params.text, text_language=params.text_language)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 368, in speak_text
    voice = self._get_or_load_voice(self.voice)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 569, in _get_or_load_voice
    maybe_model_dir = self._download_voice(voice_key)
  File "/home/mimic3/app/mimic3_tts/tts.py", line 621, in _download_voice
    download_voice(
  File "/home/mimic3/app/mimic3_tts/download.py", line 140, in download_voice
    raise VoiceDownloadError(
mimic3_tts.download.VoiceDownloadError: Failed to download file for voice af_ZA/google-nwu_low from https://github.com/MycroftAI/mimic3-voices/raw/master/voices/af_ZA/google-nwu_low/generator.onnx: HTTP Error 404: Not Found

Environment (please complete the following information):

Device type: Docker and Kubernetes
OS: Debian

Additional context
I'm using docker image 0.2.4, which should be the latest one. If I manually go to the github page for the file in question and click "raw" to download manually, it also 404s: https://github.com/MycroftAI/mimic3-voices/blob/master/voices/af_ZA/google-nwu_low/generator.onnx

It seems to be doing this for every voice I test. Is this a github issue?

Docker container is missing aplay for local playback

Describe the bug
Docker container is missing aplay for local playback

To Reproduce
Steps to reproduce the behavior:
0. Run the docker container as described in the README.md

Go to 'Webpage'
Click on 'Play audio on'
Set to 'Server'
Click on 'Speak'
See the error 'ERROR:mimic3_http.app:[Errno 2] No such file or directory: 'aplay'' in the logs and on screen.

Expected behavior
Sound!

Environment (please complete the following information):

Device type: x86
OS: Linux
Mimic3 latest (https://github.com/MycroftAI/mimic3/tree/be72c185e471e3ef939147679df9e1d00262c513)

Fix

The fix is to add --device /dev/snd to the docker run command, e.g.

docker run \
       -it --device /dev/snd \
       -p 59125:59125 \
       -v "/docker/mimic3:/home/mimic3/.local/share/mycroft/mimic3" \
       'mycroftai/mimic3'

Then install and enable audio inside the docker container.
Run the following commands in a shell as root:

mkdir -p /var/cache/apt/amd64/archives/partial
apt install -y --no-install-recommends libasound2 libasound2-plugins alsa-utils
usermod -a -G audio mimic3

Now I have sound. But all my changes will be gone with every docker update :(
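
The logged error ([Errno 2] No such file or directory: 'aplay') comes from launching a player binary that is not installed. A sketch of checking for a player up front, so that a clearer message can be shown, is given below. The candidate list and the helper `find_player` are illustrative assumptions, not mimic3's actual lookup.

```python
import shutil
import sys
from typing import Optional, Sequence


def find_player(
    candidates: Sequence[str] = ("aplay", "paplay", "play"),
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    player = find_player()
    if player is None:
        print("No audio player found; install alsa-utils or similar.", file=sys.stderr)
    else:
        print("Using", player)
```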

[Request] Release mimic3 tts for Android via f-droid

Is your feature request related to a problem? Please describe.
There's no good FOSS TTS for Android, especially one that supports multiple languages and is available via F-Droid.

Describe the solution you'd like
I would love to see an Android version start to get support, and even pre-releases made available on F-Droid. Even a third-party repo would be great, especially at first.

Describe alternatives you've considered
For some languages the situation is simply so bad that there aren't any alternatives available.

Additional context
For example, when using the current TTS for navigation, it sounds so annoying that it's hard to listen to in some languages. I even tried to just ignore it on one trip, but then my wife said in an angry voice, "Would you just shut that up!" So there really is an issue, and your current demo voices are easily better as they are now. So please boost development of the Android version and release one, even if it is a pre-release at first. This is something that de-Googled Android devices have needed for a long time, and for some languages the need is even greater!

Mimic3 not working after reboot

I installed and built Mimic3 from source on a Manjaro Linux machine. After installing from source, everything worked fine. After a reboot, I get the following error when trying to run a simple command:

[tim@tim-pc ~]$ mimic3 "Hello World!"
Reading text from stdin...
INFO:mimic3_tts.tts:Loaded voice from /usr/share/mycroft/mimic3/voices/en_UK/apope_low
ALSA lib dlmisc.c:339:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (/usr/lib/x86_64-linux-gnu/alsa-lib/libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory)
aplay: main:831: audio open error: No such device or address
Traceback (most recent call last):
File "mimic3.py", line 40, in
File "mimic3_tts/main.py", line 129, in main
File "mimic3_tts/main.py", line 481, in process_lines
File "mimic3_tts/main.py", line 512, in play_wav_bytes
File "subprocess.py", line 424, in check_output
File "subprocess.py", line 528, in run
subprocess.CalledProcessError: Command '['aplay', '-q', '/tmp/tmpl417ya40.wav']' returned non-zero exit status 1.
[1624] Failed to execute script 'mimic3' due to unhandled exception!

I then created the folder and copied the file here: /usr/lib/x86_64-linux-gnu/alsa-lib/libasound_module_pcm_pulse.so, but it just tells me another module that's missing.

Clarify and supply access to project logo artwork

Is your feature request related to a problem? Please describe.

It is not clear which logo relates to the mimic3 project. Several options exist in the various websites and artwork.

Describe the solution you'd like

Supply access to high quality logo artwork (vector/svg) and ideally a series of transparent png at the typical resolutions

Describe alternatives you've considered

I can guess at and extract what I believe is the logo from one of the images hosted here

Additional context

When self hosting mimic3 it is useful to have access to the logo for projects such as:

Mispronunciation of Finnish TTS sample

Describe the bug
In the Finnish TTS sample there are some mispronunciations; basically, some words are pronounced with the last letter missing.

To Reproduce
Steps to reproduce the behavior:

Go to 'https://mycroft.ai/mimic-3/'
Select 'Suomi (Finnish)' under "Hear My Voices"
Click 'Play' to hear TTS
Hear error 'missing ending letter for some words'

Expected behavior
The expected behavior would be to pronounce whole words, including the last letters. Some examples: the word ilmiö is pronounced as ilmi. The word etupinnasta is pronounced as etupinnast (the first time the word appears in the sentence the last a is pronounced as it should be, but it is missing the second time). The word sateenkaaren at the end is also pronounced with a fully or half cut-off n, sounding like sateenkaare.

Log files
Sorry, no logs, because I only tested this sample TTS, which on the other hand should be fixed because it will be the first thing many people listen to.

Additional context
I noticed this in the beta version too, but figured it would be ironed out by the final release. It might not be a big problem in the end, and it might also be related to the selected voice, of which only one is currently available. But since it's the first thing people will test, I think it would be a good idea to fix it.

That said, I haven't yet tried installing this TTS on my computer, so I can't yet say whether there is a similar problem in general or only in the sample text.

Pip dependency error on Ubuntu 22.04 and Python 3.10

Describe the bug
Just following the steps to install mimic3 as a command line tool:

$ sudo apt-get install libespeak-ng1
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libespeak-ng1 is already the newest version (1.50+dfsg-10).
0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.

 $ python -m venv venv
$ . venv/bin/activate
$ pip install mycroft-mimic3-tts[all]
$ pip3 install --upgrade pip
Requirement already satisfied: pip in ./venv/lib/python3.10/site-packages (22.1.2)

$ pip install mycroft-mimic3-tts[all]
Collecting mycroft-mimic3-tts[all]
  Using cached mycroft_mimic3_tts-0.2.3.tar.gz (130 kB)
  Preparing metadata (setup.py) ... done
Collecting dataclasses-json<1.0
  Using cached dataclasses_json-0.5.7-py3-none-any.whl (25 kB)
Collecting epitran==1.17
  Using cached epitran-1.17-py2.py3-none-any.whl (153 kB)
Collecting espeak-phonemizer<2.0,>=1.0
  Using cached espeak_phonemizer-1.1.0.tar.gz (18 kB)
  Preparing metadata (setup.py) ... done
Collecting gruut<3.0,>=2.3.0
  Using cached gruut-2.3.4.tar.gz (74 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy<2.0
  Using cached numpy-1.23.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.0 MB)
Collecting mycroft-mimic3-tts[all]
  Using cached mycroft_mimic3_tts-0.2.2.tar.gz (130 kB)
  Preparing metadata (setup.py) ... done
ERROR: Cannot install mycroft-mimic3-tts[all]==0.2.2 and mycroft-mimic3-tts[all]==0.2.3 because these package versions have conflicting dependencies.

The conflict is caused by:
    mycroft-mimic3-tts[all] 0.2.3 depends on onnxruntime<2.0 and >=1.6
    mycroft-mimic3-tts[all] 0.2.2 depends on onnxruntime<2.0 and >=1.6

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

My system is Ubuntu 22.04 64 bit (running on Xorg). Python 3.10.4.

Mimic does not work in a conda environment, venv, or just in general

I am trying to use mimic3 in Python and I get basically the same results no matter what I try. I have the voices en_UK/apope_low and en_US/cmu-arctic_low in my directory ~/.local/share/mycroft/mimic3/voices, but mimic3 can't find them. So I call Mimic3TTSPlugin() with no arguments, hoping to get some default voice, but I get an error. For context, I have tried installing mimic3 in different environments (conda, venv within conda, venv outside conda), and I've used Python 3.9 and 3.12. Everything gives the same results as below.

>>> from ovos_tts_plugin_mimic3 import Mimic3TTSPlugin
2024-05-21 20:02:34.068 - OVOS - ovos_utils.messagebus:<module>:9 - WARNING - Deprecation version=0.1.0. Caller=ovos_plugin_manager.templates.audio:9. decode_binary_message, send_binary_file_message, send_binary_data_message,     send_message, wait_for_reply, listen_once_for_message, get_message_lang, get_websocket, get_mycroft_bus,     listen_for_message have moved to ovos_bus_client.util
2024-05-21 20:02:34.094 - OVOS - ovos_utils.messagebus:<module>:12 - WARNING - Deprecation version=0.1.0. Caller=ovos_plugin_manager.templates.audio:9. dig_for_message, FakeMessage, FakeBus moved to ovos_utils.fakebus



>>> cfg = {"voice": "en_US/cmu-arctic_low"}


>>> mimic = Mimic3TTSPlugin(cfg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_tts_plugin_mimic3/__init__.py", line 43, in __init__
    if lang not in self.default_voices:
TypeError: unhashable type: 'dict'



>>> mimic = Mimic3TTSPlugin()
2024-05-21 20:05:41.560 - OVOS - ovos_plugin_manager.utils:load_plugin:155 - WARNING - Could not find the plugin ovos.plugin.g2p.ovos-g2p-plugin-heuristic-arpa
2024-05-21 20:05:41.561 - OVOS - ovos_plugin_manager.g2p:create:142 - ERROR - The selected G2P plugin could not be loaded.
Traceback (most recent call last):
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_plugin_manager/g2p.py", line 139, in create
    g2p = clazz(g2p_config)
TypeError: 'NoneType' object is not callable
2024-05-21 20:05:41.562 - OVOS - ovos_plugin_manager.templates.tts:__init__:205 - ERROR - G2P plugin not loaded, there will be no mouth movements
Traceback (most recent call last):
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_plugin_manager/templates/tts.py", line 203, in __init__
    self.g2p = OVOSG2PFactory.create(cfg)
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_plugin_manager/g2p.py", line 139, in create
    g2p = clazz(g2p_config)
TypeError: 'NoneType' object is not callable
2024-05-21 20:05:41.575 - OVOS - ovos_plugin_manager.templates.tts:add_metric:277 - ERROR - 'Mimic3TTSPlugin' object has no attribute 'log_timestamps'
Traceback (most recent call last):
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_plugin_manager/templates/tts.py", line 274, in add_metric
    if self.log_timestamps:
AttributeError: 'Mimic3TTSPlugin' object has no attribute 'log_timestamps'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/ovos_tts_plugin_mimic3/__init__.py", line 94, in __init__
    self.tts.preload_voice(voice)
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/mimic3_tts/tts.py", line 310, in preload_voice
    self._get_or_load_voice(key_to_load)
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/mimic3_tts/tts.py", line 595, in _get_or_load_voice
    voice = Mimic3Voice.load_from_directory(
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/mimic3_tts/voice.py", line 283, in load_from_directory
    onnx_model = Mimic3Voice._load_model(
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/mimic3_tts/voice.py", line 403, in _load_model
    onnx_model = onnxruntime.InferenceSession(
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
    self._create_inference_session(providers, provider_options)
  File "/home/philip/miniconda3/envs/mimic3/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 307, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /home/philip/.local/share/mycroft/mimic3/voices/en_US/cmu-arctic_low/generator.onnx failed:Protobuf parsing failed.
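
The first traceback above is consistent with the dict being bound to the plugin's first positional parameter, which the traceback names lang. The sketch below reproduces that failure mode with a hypothetical stand-in class. DemoPlugin, its signature, and its default_voices table are assumptions made for illustration and are not the real ovos_tts_plugin_mimic3 code. It also shows that passing the dict by keyword avoids the error in the stand-in.

```python
class DemoPlugin:
    """Hypothetical stand-in; NOT the real Mimic3TTSPlugin."""

    # Assumed lookup table keyed by language code (illustration only).
    default_voices = {"en": "en_US/cmu-arctic_low"}

    def __init__(self, lang="en", config=None):
        # The membership test hashes `lang`, so a dict here raises TypeError.
        if lang not in self.default_voices:
            raise ValueError(f"unsupported language: {lang!r}")
        self.lang = lang
        self.config = config or {}


cfg = {"voice": "en_US/cmu-arctic_low"}

try:
    DemoPlugin(cfg)  # the dict is bound to `lang`
    positional_error = None
except TypeError as err:
    positional_error = str(err)

plugin = DemoPlugin(config=cfg)  # the dict is bound to `config`
print(positional_error)
print(plugin.config["voice"])
```

Whether the real plugin accepts a config keyword in the same way should be checked against its source. The later InvalidProtobuf error is a separate problem with loading generator.onnx and is not addressed by this sketch.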

Encoding issue with mimic3-server (Latin-1 vs UTF-8)

Hi!

I have an encoding issue with mimic3-server:

$ mimic3 --remote --voice 'en_UK/apope_low' "I don’t speak English" | aplay --quiet
Reading text from stdin...
Traceback (most recent call last):
  File "mimic3.py", line 40, in <module>
  File "mimic3_tts/__main__.py", line 129, in main
  File "mimic3_tts/__main__.py", line 450, in process_lines
  File "mimic3_tts/__main__.py", line 397, in process_line
  File "mimic3_tts/__main__.py", line 587, in get_remote_wav_bytes
  File "requests/api.py", line 115, in post
  File "requests/api.py", line 59, in request
  File "requests/sessions.py", line 587, in request
  File "requests/sessions.py", line 701, in send
  File "requests/adapters.py", line 489, in send
  File "urllib3/connectionpool.py", line 703, in urlopen
  File "urllib3/connectionpool.py", line 398, in _make_request
  File "urllib3/connection.py", line 239, in request
  File "http/client.py", line 1255, in request
  File "http/client.py", line 1300, in _send_request
  File "http/client.py", line 164, in _encode
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 5: Body ('’') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
[582387] Failed to execute script 'mimic3' due to unhandled exception!
aplay: read_header:2931: erreur de lecture

However, there is no issue with mimic3:

$ mimic3 --voice 'en_UK/apope_low' "I don’t speak English" | aplay --quiet
Reading text from stdin...
INFO:mimic3_tts.tts:Loaded voice from /usr/share/mycroft/mimic3/voices/en_UK/apope_low

The error message states: "Use body.encode('utf-8') if you want to send it encoded in UTF-8." but I don’t know how to do this. I simply run the server with the command:

$ mimic3-server --num-threads 6

I couldn’t find an option to tell the server that the input is UTF-8 encoded. Here are the versions of mimic3 and mimic3-server:

$ mimic3 --version
0.2.3
$ mimic3-server --version
0.1.1

Here are my locales and system:

$ env | grep LANG
LANG=fr_FR.utf8
GDM_LANG=fr_FR.utf8
$ lsb_release -a
LSB Version:    n/a
Distributor ID: Manjaro-ARM
Description:    Manjaro ARM Linux
Release:        23.02
Codename:       n/a

Here is a tip to get around the issue:

echo "I don’t speak English" | iconv -f UTF-8 -t ISO-8859-1//TRANSLIT | mimic3 --remote --voice 'en_UK/apope_low' | aplay --quiet

This converts UTF-8 strings to ISO-8859-1 (i.e. Latin-1) while attempting to transliterate characters that Latin-1 cannot represent, like "’".

I think this is a bug, because mimic3-server should accept UTF-8 encoding, as mimic3 does without problem.
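
The traceback shows http.client trying to encode a str request body as Latin-1. A minimal sketch of the approach the error message suggests is below: encode the text to UTF-8 bytes before handing it to the HTTP library, so no implicit Latin-1 step occurs. The encode_body helper and the Content-Type header are assumptions for illustration; this is not necessarily how the project fixed or would fix the client.

```python
def encode_body(text: str) -> bytes:
    """Encode request text as UTF-8 bytes so no implicit Latin-1 step is needed."""
    return text.encode("utf-8")


text = "I don\u2019t speak English"  # contains U+2019, as in the report

# Reproduce the failure: U+2019 has no Latin-1 representation.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

body = encode_body(text)
# Assumed header; whether the server reads the charset is not confirmed here.
headers = {"Content-Type": "text/plain; charset=utf-8"}
print(latin1_ok, len(body))
```

A bytes body of this kind could then be passed as the data argument of an HTTP POST instead of the original str.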

mimic3 Docker Image on Podman: PermissionError: [Errno 13] Permission denied: '/home/mimic3/.local/share/mycroft/mimic3/voices'

Describe the bug
When using the Docker image with Podman, I get a permission error when opening the local web page, and the site does not work.

To Reproduce

1. Follow the instructions at https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#docker-image but with Podman.
2. Launch the container with Podman from the CLI:
   2a. With the local mount option:
   $ podman run -it -p 127.0.0.1:59125:59125 -v /home/user/.local/share/mycroft/mimic3:/home/mimic3/.local/share/mycroft/mimic3 mycroftai/mimic3
   2b. Or without a local mount:
   $ podman run -it -p 127.0.0.1:59125:59125 mycroftai/mimic3
3. Go to the local web page at http://localhost:59125/ and see the error message in the CLI.

Expected behavior
No Error Message

Log files

$ podman run -it -p 127.0.0.1:59125:59125 -v /home/USER/.local/share/mycroft/mimic3:/home/mimic3/.local/share/mycroft/mimic3 mycroftai/mimic3
INFO:__main__:Starting web server
[2022-07-01 18:17:56 +0000] [1] [INFO] Running on http://0.0.0.0:59125 (CTRL + C to quit)
INFO:hypercorn.error:Running on http://0.0.0.0:59125 (CTRL + C to quit)
ERROR:mimic3_http.app:[Errno 13] Permission denied: '/home/mimic3/.local/share/mycroft/mimic3/voices'
Traceback (most recent call last):
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1512, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/home/mimic3/app/.venv/lib/python3.9/site-packages/quart/app.py", line 1557, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "/home/mimic3/app/mimic3_http/app.py", line 241, in api_voices
    voices_by_key = {v.key: v for v in _MIMIC3.get_voices()}
  File "/home/mimic3/app/mimic3_http/app.py", line 241, in <dictcomp>
    voices_by_key = {v.key: v for v in _MIMIC3.get_voices()}
  File "/home/mimic3/app/mimic3_tts/tts.py", line 195, in get_voices
    for lang_dir in voices_dir.iterdir():
  File "/usr/lib/python3.9/pathlib.py", line 1149, in iterdir
    for name in self._accessor.listdir(self):
PermissionError: [Errno 13] Permission denied: '/home/mimic3/.local/share/mycroft/mimic3/voices'

Environment:

Device type: Laptop
OS: Fedora Silverblue 36
mycroftai/mimic3
Other versions: podman version 4.1.1

Additional context/ Workaround/ Fix:
I was able to fix the error by logging into the container as root, creating the directories, and setting the owner and group to mimic3:

podman exec -u 0 -it sad_lalande bash
mkdir -p /home/mimic3/.local/share/mycroft/mimic3
chown -R mimic3:mimic3 .local
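
The failing call in the log is a directory listing (iterdir) on the voices directory. As a rough pre-flight check, the sketch below tests whether a directory exists and can be listed by the current user before a server tries to iterate it. The helper name can_list_directory is hypothetical, and os.access reports on the process's real user ID, so its answer may differ from what actually happens inside a rootless container.

```python
import os


def can_list_directory(path: str) -> bool:
    """Return True if `path` is a directory the current user can read and traverse."""
    # os.path.isdir returns False (rather than raising) when the path is missing.
    return os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)


voices_dir = os.path.expanduser("~/.local/share/mycroft/mimic3/voices")
print(voices_dir, can_list_directory(voices_dir))
```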

How to train my own models?

Hi! I've been watching your TTS systems for a while and appreciate your work! So thank you for open-sourcing your code. I genuinely believe that gruut has a great future and will become a standard in modern TTS systems.

How can I train new models for Mimic 3? I like how English sounds, but Russian has a faulty alignment, and I want to address that. I can contribute an improved version of Russian TTS to your code, but I can only find your GlowTTS training code for Larynx, not for Mimic 3. Is the training code available, or is it only for personal use?

low-end GPU VRAM full causing onnxruntime Integer overflow

Describe the bug
I have tested GPU support on a laptop with a GeForce 840M (2 GB VRAM). After 6-8 runs, mimic3 starts showing errors. A short investigation showed that GPU memory fills up, because the application's VRAM usage only grows.

To Reproduce
Steps to reproduce the behavior:

1. Run tests on mimic3
2. Watch GPU VRAM usage go up

Expected behavior
The available VRAM size should be taken into account while the app is running.

Log files

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_5953' Status Message: /onnxruntime_src/onnxruntime/core/common/safeint.h:17 static void SafeIntExceptionHandler<onnxruntime::OnnxRuntimeException>::SafeIntOnOverflow() Integer overflow


Environment (please complete the following information):

Device type: Laptop, CPU i3-4020U, GeForce 840M 2 GB, 8 GB RAM
OS: Ubuntu 22.04
Mycroft-core version: docker-gpu
Other versions: git main as it was built

Additional context
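
ONNX Runtime's CUDA execution provider accepts provider options that cap and shape its memory arena, including gpu_mem_limit and arena_extend_strategy. The sketch below only builds such a providers list. The helper name cuda_providers and the 1 GiB limit are assumptions for illustration, it is not established here that mimic3 exposes a way to pass these options, and whether this avoids the overflow on a 2 GB card is untested.

```python
def cuda_providers(gpu_mem_limit_bytes: int, device_id: int = 0) -> list:
    """Build an ONNX Runtime providers list with a capped CUDA memory arena."""
    cuda_options = {
        "device_id": device_id,
        # Upper bound for the CUDA memory arena, in bytes.
        "gpu_mem_limit": gpu_mem_limit_bytes,
        # Grow the arena by the requested amount instead of powers of two.
        "arena_extend_strategy": "kSameAsRequested",
    }
    # Fall back to CPU if the CUDA provider is unavailable.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


providers = cuda_providers(1 * 1024**3)  # assumed 1 GiB cap
# With onnxruntime installed, the list could be passed as:
#   onnxruntime.InferenceSession(model_path, providers=providers)
print(providers)
```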

Unable to install pip conflict.

Describe the bug
Unable to install using the pip instructions on Ubuntu 18.04.

To Reproduce
Steps to reproduce the behavior:

1. Follow the install instructions for the command-line tool.
2. See the error:

Collecting mycroft-mimic3-tts
  Using cached mycroft_mimic3_tts-0.2.3.tar.gz (130 kB)
  Preparing metadata (setup.py) ... done
Collecting dataclasses-json<1.0
  Using cached dataclasses_json-0.5.7-py3-none-any.whl (25 kB)
Collecting epitran==1.17
  Using cached epitran-1.17-py2.py3-none-any.whl (153 kB)
Collecting espeak-phonemizer<2.0,>=1.0
  Using cached espeak_phonemizer-1.1.0.tar.gz (18 kB)
  Preparing metadata (setup.py) ... done
Collecting gruut<3.0,>=2.3.0
  Using cached gruut-2.3.4.tar.gz (74 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy<2.0
  Using cached numpy-1.19.5-cp36-cp36m-manylinux2010_x86_64.whl (14.8 MB)
Collecting onnxruntime<2.0,>=1.6
  Using cached onnxruntime-1.10.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
Collecting phonemes2ids<2.0
  Using cached phonemes2ids-1.2.2.tar.gz (12 kB)
  Preparing metadata (setup.py) ... done
Collecting quart-cors
  Using cached Quart_CORS-0.1.3-py3-none-any.whl (6.3 kB)
Collecting mycroft-mimic3-tts
  Using cached mycroft_mimic3_tts-0.2.2.tar.gz (130 kB)
  Preparing metadata (setup.py) ... done
ERROR: Cannot install mycroft-mimic3-tts==0.2.2 and mycroft-mimic3-tts==0.2.3 because these package versions have conflicting dependencies.

The conflict is caused by:
    mycroft-mimic3-tts 0.2.3 depends on quart<1.0 and >=0.16
    mycroft-mimic3-tts 0.2.2 depends on quart<1.0 and >=0.16

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict


Environment (please complete the following information):

Device type: x86 laptop
OS: Ubuntu 18.04
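
The cp36 wheel tags in the log indicate that pip was running under Python 3.6, and pip reports the quart requirement as the one it could not satisfy. One plausible reading is that no quart release in that range supports this interpreter. The sketch below is a simple interpreter check; the minimum version (3, 7) is an assumption for illustration and should be confirmed against each dependency's metadata.

```python
import sys

# Assumption for illustration: the oldest interpreter the pinned
# dependencies support. Verify against the packages' own metadata.
ASSUMED_MIN_PYTHON = (3, 7)


def interpreter_is_supported(version_info=sys.version_info, minimum=ASSUMED_MIN_PYTHON) -> bool:
    """Compare the (major, minor) interpreter version with a minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


print(sys.version_info[:2], interpreter_is_supported())
```

If the check fails, creating the virtual environment with a newer Python interpreter before running pip may be worth trying.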

mimic3-server: OpenAPI link and page do not work when app is not hosted at /

Describe the bug
On the Mimic 3 HTTP server index page, the link to the OpenAPI page refers to the absolute path /openapi/. When the app is being hosted at a location that is not the server root (through a reverse proxy, for example, in a subdirectory like /mimic3/), the link will point to the wrong location (e.g. /openapi/ instead of /mimic3/openapi/).

Furthermore, when manually navigating to the correct OpenAPI URL, the page will fail to load, likely because the client-side JS makes the same assumption that the app is always being hosted at /. The JS console reports:

Uncaught ReferenceError: SwaggerUIBundle is not defined
    onload https://example.com/mimic3/openapi/:40

The remaining functionality of the web app is not impacted as far as I can tell.

To reproduce
Steps to reproduce the behavior:

1. Start a mimic3-server instance.
2. Reverse-proxy the app to a location that is not /, for example /mimic3/.
3. Click on the "API" link next to "Docs".
4. The link will mistakenly lead to /openapi/ instead of /mimic3/openapi/.

Expected behavior
The link should be relative to the location of the app, and the OpenAPI page should work.

Environment

Device type: x86_64
OS: Arch Linux
Python 3.9
httpd 2.4.54
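
One way to make such links independent of the mount point is to build them from a configurable prefix instead of hard-coding a leading /. The helper below is a hypothetical sketch: app_url is not part of mimic3-server, and where the prefix would come from (a command-line option, or a header set by the reverse proxy) is an assumption.

```python
def app_url(prefix: str, path: str) -> str:
    """Join a mount prefix and an app-relative path into one absolute URL path."""
    stripped = prefix.strip("/")
    clean_prefix = f"/{stripped}" if stripped else ""
    return f"{clean_prefix}/{path.lstrip('/')}"


print(app_url("/mimic3/", "openapi/"))  # hosted under /mimic3/
print(app_url("", "openapi/"))          # hosted at the server root
```

Emitting purely relative links (for example openapi/ without a leading slash) in the page template would be an alternative that needs no prefix at all.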

Get mimic3 to work natively on Windows

Describe the solution you'd like
I want to be able to install mimic3 on my Windows system to use as the text-to-speech engine.

Additional context
I have started the process and am able to get it to run, but the WAV file it creates is not playable.
I am trying to figure out how to debug that issue. espeak-ng works on Windows.
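
A possible first debugging step is to check whether the generated file has a valid WAV header at all. The sketch below uses Python's standard wave module to read the header fields; describe_wav is a hypothetical helper name, and out.wav in the usage note is a placeholder path. wave.open raises wave.Error when the file is not a well-formed RIFF/WAVE file, which would at least narrow down where the output goes wrong.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic header fields from a WAV file; raises wave.Error if malformed."""
    with wave.open(path, "rb") as wav_file:
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": wav_file.getframerate(),
            "frames": wav_file.getnframes(),
        }
```

For example, describe_wav("out.wav") on a file written by mimic3 would show whether the channel count, sample width, and sample rate look sensible.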

TypeError: Object of type set is not JSON serializable

Hi,
I installed it on Windows and Ubuntu, and when I specify a voice I get this message:

ERROR:mimic3_http.app:Object of type set is not JSON serializable
Traceback (most recent call last):
  File "D:\Python\Python39\lib\site-packages\quart\app.py", line 1673, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "D:\Python\Python39\lib\site-packages\quart\app.py", line 1718, in dispatch_request
    return await self.ensure_async(handler)(**request_.view_args)
  File "D:\Python\Python39\lib\site-packages\mycroft_mimic3_tts-0.2.3-py3.9.egg\mimic3_http\app.py", line 270, in api_voices
    return jsonify(voice_dicts)
  File "D:\Python\Python39\lib\site-packages\quart\json\__init__.py", line 32, in jsonify
    return current_app.json.response(*args, **kwargs)
  File "D:\Python\Python39\lib\site-packages\quart\json\provider.py", line 205, in response
    return self._app.response_class(self.dumps(object_, **dump_args), mimetype=self.mimetype)
  File "D:\Python\Python39\lib\site-packages\quart\json\provider.py", line 171, in dumps
    return json.dumps(object_, **kwargs)
  File "D:\Python\Python39\lib\json\__init__.py", line 234, in dumps
    return cls(
  File "D:\Python\Python39\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "D:\Python\Python39\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "D:\Python\Python39\lib\site-packages\quart\json\provider.py", line 114, in _default
    raise TypeError(f"Object of type {type(object_).__name__} is not JSON serializable")
TypeError: Object of type set is not JSON serializable

Only the API is functional; my web interface is not.
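
The error means that some value in the voice dictionaries is a Python set, which the standard JSON encoder does not handle. A minimal sketch of one general workaround is below: convert sets to sorted lists through the default hook of json.dumps. The aliases field and the encode_sets helper are hypothetical and only illustrate the technique; they do not claim to match the data or the fix in mimic3_http.

```python
import json


def encode_sets(obj):
    """json.dumps `default` hook: serialize sets as sorted lists."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical voice entry containing a set-valued field.
voice = {"key": "en_US/cmu-arctic_low", "aliases": {"b", "a"}}
payload = json.dumps(voice, default=encode_sets)
print(payload)
```

Converting the sets to lists before calling jsonify would be an equivalent alternative.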

voices.json en_US/cmu-arctic_low metadata is incorrect

Describe the bug
Incorrect voices.json information: en_US/cmu-arctic_low/config.json filesize and hash are incorrect in voices.json. They are currently:

size_bytes: 3583
sha256_sum: 40d55cdef742aadff581b4418763db0b559e7c49d1602c39d552cfd8ae47c249

To Reproduce
Steps to reproduce the behavior:

1. In a terminal, download the voice by running:

mimic3 "Hello" --voice en_US/cmu-arctic_low

2. Use Mimic3 Python utilities to compare the expected and actual sha256_sum and size_bytes:

import os
import mimic3_tts
print("Expected", mimic3_tts._resources._VOICES["en_US/cmu-arctic_low"]["files"]["config.json"])
file_path = os.path.join(mimic3_tts.download.DEFAULT_VOICES_DOWNLOAD_DIR, "en_US/cmu-arctic_low/config.json")
print("Actual size_bytes", os.path.getsize(file_path))
with open(file_path, "rb") as f:
  print("Actual sha256_sum", mimic3_tts.utils.file_sha256_sum(f))

Notice that expected does not match actual.

Expected behavior
Expected should match actual in step 2 above.
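
The same comparison can be made without importing mimic3_tts, which may help rule out a problem in the library's own helpers. The sketch below computes a file's size and SHA-256 digest with the standard library only; file_size_and_sha256 is a hypothetical helper name, and the result would still have to be compared by hand against the values listed in voices.json.

```python
import hashlib
import os


def file_size_and_sha256(path: str, chunk_size: int = 65536):
    """Return (size in bytes, hex SHA-256 digest) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```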

Log files
N/A

Environment (please complete the following information):
N/A

Additional context
N/A

Mimic 3 voices don't pause at em dashes

Describe the bug
Mimic 3 doesn't pause when it encounters an em dash in the text. An em dash is "—", which is different from a hyphen ("-").

To Reproduce
Steps to reproduce the behavior:
1. Create text that includes an em dash ("—"), such as "Today—not tomorrow—is the day we have been waiting for".
2. Perform a TTS job with this text.
3. The resulting speech does not pause at the location of the em dashes.

Expected behavior
There should be a small pause at an em dash, similar to the pause at a comma.

Environment (please complete the following information):

Device type: Raspberry Pi 4
OS: Raspberry Pi Os, linux kernel 5.10.103-v8+
Mycroft-core version: latest docker version (mycroftai/mimic3)
Other versions:

Additional context
I'm only working in English. I noticed this with the en_US/vctk_low#p276 voice, but I think it's present in all the voices I've tried.

Thanks!
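
Until the voices handle em dashes themselves, a possible workaround is to rewrite the text before sending it to the TTS engine. The sketch below replaces each em dash, together with any surrounding whitespace, with a comma and a space, since the report says commas already produce the desired pause. dashes_to_commas is a hypothetical helper, not part of Mimic 3.

```python
import re

# U+2014 EM DASH with optional whitespace on either side.
_EM_DASH = re.compile(r"\s*\u2014\s*")


def dashes_to_commas(text: str) -> str:
    """Replace em dashes with ', ' so the engine pauses as it would at a comma."""
    return _EM_DASH.sub(", ", text)


sample = "Today\u2014not tomorrow\u2014is the day we have been waiting for"
print(dashes_to_commas(sample))
```

Hyphens are left untouched, so hyphenated words keep their spelling.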

    def end_utterance(self) -> typing.Iterable[BaseResult]:
        last_settings: typing.Optional[Mimic3Settings] = None
        sent_phonemes: PHONEMES_LIST_TYPE = []

        for result in self._results:
            if isinstance(result, Mimic3Phonemes):
                if result.is_utterance:
                    # Utterance boundary
                    if (
                        sent_phonemes
                        and (last_settings is not None)
                        and (result.current_settings != last_settings)
                    ):
                        # Not compatible with existing utterance.
                        # Need to speak previous utterance first.
                        yield self._speak_sentence_phonemes(
                            sent_phonemes, settings=last_settings
                        )
                        sent_phonemes.clear()

                    # Current utterance
                    sent_phonemes.extend(result.phonemes)
                    if sent_phonemes:
                        yield self._speak_sentence_phonemes(
                            sent_phonemes, settings=last_settings
                        )
                        sent_phonemes.clear()
                else:
                    # Continue until utterance boundary
                    sent_phonemes.extend(result.phonemes)

                last_settings = result.current_settings
