coqui-ai / STT-examples
🐸STT integration examples
Home Page: https://github.com/coqui-ai/STT
License: Mozilla Public License 2.0
I am getting a segmentation fault (core dumped) error when doing live transcription through WebSockets, after approximately one second or less. I only get one word of transcription before the error appears. Sometimes I get a malloc error instead. I don't know how to fix this.
The web_microphone_websocket example works on localhost, but when served from a domain the socket.io code produces CORS errors. For example, in Chrome:
Access to XMLHttpRequest at 'https://example.com:4000/socket.io/?EIO=3&transport=polling&t=Nu-iT4E' from origin 'https://example.com.cz:3000' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Several recipes from Stack Overflow and the socket.io docs did not work; perhaps someone with better JavaScript knowledge can help.
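For reference, the general shape of the fix is that the server must explicitly whitelist the client's origin. The example below is only a sketch of that idea using the python-socketio package, not the Node socket.io server the example actually uses, and the origin is taken from the error message above:

```python
# Sketch: allowing a cross-origin client with python-socketio.
# NOTE: this is the Python analogue of the server-side fix, not the Node
# socket.io code in web_microphone_websocket; the origin below is an
# assumption taken from the browser error message.
import socketio

sio = socketio.Server(cors_allowed_origins=[
    "https://example.com.cz:3000",  # the origin the browser reports
])
app = socketio.WSGIApp(sio)

@sio.event
def connect(sid, environ):
    print("client connected:", sid)
```

In Node's socket.io the equivalent knob is the `cors` option passed when constructing the server.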
Some example projects are not listed under the appropriate section of the main README. Notably, for Android the main README points to a deprecated example while a working example goes unlisted. Adding the missing entries manually would solve the issue temporarily, but a more permanent solution would be a checklist in the PR template or something similar.
In the vad_transcriber example README, the last link, to https://mozilla-voice-stt.readthedocs.io/en/latest/Error-Codes.html, returns a 404.
STT-examples/vad_transcriber/README.md
Line 106 in 6487e4f
After checking out the project and opening it in Android Studio (4.1), running the app throws this error:
Could not resolve all files for configuration ':app:debugRuntimeClasspath'.
Could not find ai.coqui:libstt:0.9.3.
Searched in the following locations:
- https://dl.google.com/dl/android/maven2/ai/coqui/libstt/0.9.3/libstt-0.9.3.pom
- https://jcenter.bintray.com/ai/coqui/libstt/0.9.3/libstt-0.9.3.pom
Hi,
I was curious whether it is possible to use the WASM file on Node.js for unsupported Node.js versions. I know I can load a WASM file, but I was hoping there might be an example, since I am not sure the JavaScript in a browser is 1:1 with that of Node.js (e.g. FileReader or audio input).
Thanks for any advice.
def four_to_one(self, frame):
    # [ch1,ch2,ch3,ch4 || ch1,ch2,ch3,ch4 || ch1,ch2,ch3,ch4 || ch1,ch2,ch3,ch4]
    frame = np.frombuffer(frame, np.int16)
    data = frame.reshape((self.CHANNELS, -1), order='F')
    b = 1 / self.CHANNELS
    x = np.int16(0)
    for c in data:
        x += c * b
    frame = (x.astype(np.int16)).tobytes()
    return frame
The code above is part of converting a 4-channel frame to 1 channel. The mic_vad_streaming.py file freezes when running on a Raspberry Pi and trying to record 4 channels. The function above is called inside the vad_collector function when the length of the frame is larger than 2560.
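For what it's worth, the per-channel loop above can be replaced by a single vectorized mean, which avoids the Python-level accumulation. This is a sketch assuming interleaved little-endian int16 input; `downmix_to_mono` is a hypothetical helper name:

```python
import numpy as np

def downmix_to_mono(frame: bytes, channels: int = 4) -> bytes:
    """Average interleaved int16 samples [c1,c2,...,cN, c1,...] down to mono."""
    samples = np.frombuffer(frame, dtype=np.int16)
    # reshape(-1, channels): each row holds one time step across all channels,
    # so mean(axis=1) averages the channels at every time step.
    mono = samples.reshape(-1, channels).mean(axis=1)
    return mono.astype(np.int16).tobytes()
```

This is equivalent to the original `reshape((CHANNELS, -1), order='F')` plus the accumulation loop, just expressed in one pass.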
I also posted this in the Deep speech examples repo, but I believe my chances for a reaction are better in here:
mozilla/DeepSpeech-examples#143
Hey everyone, thanks a lot for your great work.
To make Coqui more accessible to non-tech folks, it would be great to have a small desktop client. The electron example shows that this is doable and that a cross-platform app is relatively easy to create. I want to work on this, but I have very little experience with Electron apps, so it might take a while and I might need some help.
I imagine a very minimalistic frontend that contains:
What are your thoughts on this? I want to start working on it during August, beginning by adapting the example app in a separate repo. Do you think this is a doable plan? I am also open to other proposals.
For example:
pi@raspberrypi:~/Source/STT-examples $ git diff
diff --git a/mic_vad_streaming/requirements.txt b/mic_vad_streaming/requirements.txt
index e97d363..3eb12cc 100644
--- a/mic_vad_streaming/requirements.txt
+++ b/mic_vad_streaming/requirements.txt
@@ -1,7 +1,7 @@
-stt~=1.0.0
+stt~=1.3.0
pyaudio~=0.2.11
webrtcvad~=2.0.10
halo~=0.0.18
numpy>=1.15.1
scipy>=1.1.0
-pyautogui~=0.9.52
\ No newline at end of file
+pyautogui~=0.9.52
Unfortunately we forgot to add a license file to this repo when we moved the examples to a separate repository. The main repo had the Mozilla Public License 2.0 and our intention was always to keep it as is. I'm tagging contributors to this repo to confirm that you agree to license your contributions as MPL-2.0. If you agree, please reply with a comment here saying "I agree to license my contributions to this repository under the Mozilla Public License 2.0."
I am trying to change the channel count to 4 to work with 4 channels of audio in the mic_vad_streaming.py file. I have a ReSpeaker 4-Mic Array kit. When I change line 20 to "CHANNELS = 4" and line 134 to "is_speech = self.vad.is_speech(frame[0::self.CHANNELS], self.sample_rate)", it starts endlessly collecting frames until it reaches the limit, then starts again. How can I solve this problem?
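One likely culprit, as a hedged guess: `frame[0::self.CHANNELS]` slices raw bytes, but each int16 sample is two bytes wide, so that expression splits samples in half rather than extracting a channel. A sketch of sample-wise channel extraction, assuming interleaved little-endian int16 audio (`extract_channel` is a hypothetical helper name):

```python
import numpy as np

def extract_channel(frame: bytes, channels: int = 4, channel: int = 0) -> bytes:
    """Pull one channel out of an interleaved int16 multi-channel frame.

    Slicing the int16 view (not the raw bytes) keeps samples intact,
    producing the mono 16-bit PCM stream that webrtcvad's is_speech()
    expects.
    """
    samples = np.frombuffer(frame, dtype=np.int16)
    return samples[channel::channels].tobytes()
```

`is_speech(extract_channel(frame), sample_rate)` would then see a mono frame of the expected duration.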
I tried to run the Python websocket example, but I am getting an error.
I cloned the repo and first did
sudo docker build .
This succeeded. Then I ran the following. It is possible I am misunderstanding the build process, but I get this error:
$ sudo docker container run b6898d9a294d
TensorFlow: v2.2.0-24-g1c1b2b9
DeepSpeech: v0.8.2-0-g02e4c76
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2021-11-21 23:05:17.440646: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Not found: /opt/deepspeech/model.tflite; No such file or directory
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/deepspeech/deepspeech_server/app.py", line 17, in <module>
scorer_path=Path(conf["deepspeech.scorer"]).absolute().as_posix(),
File "/opt/deepspeech/deepspeech_server/engine.py", line 30, in __init__
self.model = Model(model_path=model_path)
File "/usr/local/lib/python3.6/dist-packages/deepspeech/__init__.py", line 38, in __init__
raise RuntimeError("CreateModel failed with '{}' (0x{:X})".format(deepspeech.impl.ErrorCodeToErrorMessage(status),status))
RuntimeError: CreateModel failed with 'Error reading the proto buffer model file.' (0x3005)
I was wondering if a CPU-architecture check might be useful to have? I am a bit stuck on how to fix this issue given that, to my understanding, Coqui uses its own fork of TensorFlow.
Running it without the container,
python -m deepspeech_server.app
produces the same error.
Thanks!
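The CPU-architecture check suggested above could be sketched with the standard library alone. The supported-architecture set here is an illustrative assumption, not the project's actual support matrix:

```python
import platform

# Architectures for which prebuilt wheels are commonly published.
# NOTE: illustrative assumption, not an authoritative list for STT.
SUPPORTED_ARCHES = {"x86_64", "amd64", "aarch64", "arm64"}

def check_cpu_architecture() -> str:
    """Return the machine architecture, warning when it looks unsupported."""
    arch = platform.machine().lower()
    if arch not in SUPPORTED_ARCHES:
        print(f"Warning: architecture '{arch}' may not have a prebuilt "
              "STT/TensorFlow binary; consider building from source.")
    return arch
```

Running such a check at startup would turn the opaque proto-buffer/segfault failures into an actionable message.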
https://github.com/coqui-ai/STT-examples/tree/r0.9/vad_transcriber
e.g. contains a link to https://github.com/coqui-ai/STT-examples/blob/doc/audioTranscript.png, which returns a 404, and still contains Mozilla references.
Error log:
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/STT - Not found
npm ERR! 404
npm ERR! 404 'STT@^1.3.0' is not in this registry.
npm ERR! 404 This package name is not valid, because
npm ERR! 404 1. name can no longer contain capital letters
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.
This is essentially a copy of
mozilla/DeepSpeech-examples#187
where the problem seems to be the same. Is there any support on this?
Do you have any example with C?