Comments (11)
OK, it started working.
I used the following to resolve it:
```
sudo apt-get install libcudnn8
```
from realtimestt.
Should only be a warning, not an error. The script should work nevertheless (tested on Firefox 130.0 and Chrome 128.0.6613.120, Windows 11 64-bit).
If anybody knows how to get rid of that warning, I'm happy to hear ideas or take PRs.
from realtimestt.
I couldn't resolve the following error:
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
(/home/azureuser/realtime_translator/venv) azureuser@VoiceCloning:/realtime_translator$ python server.py
Starting server, please wait...
Initializing RealtimeSTT...
RealtimeSTT initialized
Server started. Press Ctrl+C to stop the server.
/home/azureuser/realtime_translator/server.py:109: DeprecationWarning: There is no current event loop
asyncio.get_event_loop().run_until_complete(start_server)
/home/azureuser/realtime_translator/server.py:110: DeprecationWarning: There is no current event loop
asyncio.get_event_loop().run_forever()
Client connected
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Aborted (core dumped)
(/home/azureuser/realtime_translator/venv) azureuser@VoiceCloning:/realtime_translator$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
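The DeprecationWarning lines in the log above are raised by asyncio.get_event_loop() when it is called with no running event loop on recent Python versions. One common way to avoid them is to move start-up into a coroutine and hand it to asyncio.run(). A minimal sketch, assuming the server can be started from inside a coroutine; start_server below is a hypothetical stand-in, not the code from server.py:

```python
import asyncio


async def start_server(ready: asyncio.Event) -> None:
    """Hypothetical stand-in for the real server start-up coroutine."""
    ready.set()


async def main(run_for: float) -> bool:
    ready = asyncio.Event()
    # Replaces asyncio.get_event_loop().run_until_complete(start_server)
    await start_server(ready)
    # Replaces run_forever(); a real server would wait indefinitely here,
    # for example with `await asyncio.Future()`.
    await asyncio.sleep(run_for)
    return ready.is_set()


if __name__ == "__main__":
    # asyncio.run() creates and closes the event loop itself,
    # so get_event_loop() is never called without a loop.
    asyncio.run(main(0.1))
```

Whether this fits depends on how server.py builds its server object, so treat it as a pattern rather than a drop-in patch.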
from realtimestt.
> Should only be a warning, not an error. The script should work nevertheless (tested on Firefox 130.0 and Chrome 128.0.6613.120, Windows 11 64-bit).
> If anybody knows how to get rid of that warning, I'm happy to hear ideas or take PRs.
I left the language parameter empty and spoke Hindi; sometimes it translated, and sometimes it gave wrong sentences.
Please tell me which parameters I should set for the best real-time STT in any of the supported languages.
from realtimestt.
> OK, it started working. I used the following to resolve it: sudo apt-get install libcudnn8
That should be the correct way to solve it. faster_whisper still needs cuDNN 8 on Linux.
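To confirm whether the cuDNN 8 library named in the error message is visible to the dynamic loader before starting the server, a small check along these lines may help. This is only a sketch: the helper name is made up for illustration, and the library name is taken from the error text in this thread.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library (or one of its dependencies) cannot be opened.
        return False
    return True


if __name__ == "__main__":
    # Library name copied from the "Could not load library" error message.
    lib = "libcudnn_ops_infer.so.8"
    status = "found" if can_load_shared_library(lib) else "missing"
    print(f"{lib}: {status}")
```

If the library is reported as missing, installing libcudnn8 as described above is the fix that worked in this thread.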
from realtimestt.
> I left the language parameter empty and spoke Hindi; sometimes it translated, and sometimes it gave wrong sentences.
> Please tell me which parameters I should set for the best real-time STT in any of the supported languages.
- First, change your realtime_model_type. The default, "tiny.en", is only good for English. For Hindi you need a larger model. Try medium or, if that does not work well enough, try large or large-v2/large-v3.
- Set the language parameter to "hi" for Hindi. Whisper performs much better with a fixed language.
- Maybe try large-v3 for the final transcription, since with a fixed language parameter it sometimes delivers better results than large-v2.
So my first take to improve this would be to open server.py and change recorder_config to:
recorder_config = {
    'spinner': False,
    'use_microphone': False,
    'model': 'large-v3',
    'language': 'hi',
    'silero_sensitivity': 0.4,
    'webrtc_sensitivity': 2,
    'post_speech_silence_duration': 0.7,
    'min_length_of_recording': 0,
    'min_gap_between_recordings': 0,
    'enable_realtime_transcription': True,
    'realtime_processing_pause': 0,
    'realtime_model_type': 'medium',
    'on_realtime_transcription_stabilized': text_detected,
}
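The advice above (fixed language, no English-only model for other languages) can be sketched as a small helper that builds the relevant part of the config and rejects the obvious mismatch. The function build_recorder_config is hypothetical and not part of RealtimeSTT; only the dictionary keys come from the config shown above.

```python
from typing import Any, Dict


def build_recorder_config(language: str,
                          final_model: str = "large-v3",
                          realtime_model: str = "medium") -> Dict[str, Any]:
    """Build the model/language part of a recorder_config-style dict."""
    if not language:
        raise ValueError("set an explicit language code, e.g. 'hi'")
    for model in (final_model, realtime_model):
        # English-only Whisper checkpoints carry an '.en' suffix (e.g. 'tiny.en').
        if model.endswith(".en") and language != "en":
            raise ValueError(f"{model!r} is English-only; language is {language!r}")
    return {
        "model": final_model,
        "language": language,
        "enable_realtime_transcription": True,
        "realtime_model_type": realtime_model,
    }


if __name__ == "__main__":
    print(build_recorder_config("hi"))
```

The remaining keys (sensitivities, callbacks and so on) would be merged in as in the full config above.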
from realtimestt.
You can see all available language codes here btw
from realtimestt.
It's not giving good output for Hindi. Can I also use a model checkpoint from Hugging Face?
from realtimestt.
It should be possible to load any Hugging Face model by specifying the user and model name as "username/modelname".
Please also try higher beam sizes for transcription and float16 precision in recorder_config:
'beam_size': 10,
'beam_size_realtime': 7,
'compute_type': 'float16',
from realtimestt.
I was trying to load "vasista22/whisper-hindi-medium" from Hugging Face, but it gave the following error.
I checked: the model works with normal Whisper inference code, but here it fails.
Error:
Starting server, please wait...
Initializing RealtimeSTT...
config.json: 100%|████████████████| 1.29k/1.29k [00:00<00:00, 9.02MB/s]
preprocessor_config.json: 100%|██████| 185k/185k [00:00<00:00, 889kB/s]
RealTimeSTT: root - ERROR - Error initializing faster_whisper realtime transcription model: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 513, in __init__
self.realtime_model_type = faster_whisper.WhisperModel(
File "/home/azureuser/realtime_translator/venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
RuntimeError: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Exception in thread Thread-1 (recorder_thread):
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 946, in run
self._target(*self._args, **self._kwargs)
File "/home/azureuser/realtime_translator/server.py", line 53, in recorder_thread
recorder = AudioToTextRecorder(**recorder_config)
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 513, in __init__
self.realtime_model_type = faster_whisper.WhisperModel(
File "/home/azureuser/realtime_translator/venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
RuntimeError: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Exception in thread Thread-2 (_transcription_worker):
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 946, in run
self._target(*self._args, **self._kwargs)
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 775, in _transcription_worker
audio, language = conn.recv()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
from realtimestt.
Maybe it needs to be converted to the CTranslate2 format first; I'm not sure. This is out of scope for RealtimeSTT, so I'd discuss it in the faster_whisper repo.
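The RuntimeError above says that model.bin could not be opened in the downloaded snapshot, and model.bin is the weights file CTranslate2 looks for. A quick way to check a model directory before pointing the recorder at it, as a sketch (the helper name is hypothetical):

```python
from pathlib import Path


def has_ctranslate2_weights(model_dir: str) -> bool:
    """Return True if the directory contains the model.bin file that the
    error message in this thread reports as missing."""
    return (Path(model_dir) / "model.bin").is_file()


if __name__ == "__main__":
    import sys

    # Pass the snapshot directory from the error message as the first argument.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    if has_ctranslate2_weights(directory):
        print("model.bin found")
    else:
        print("model.bin missing: this is probably a plain Transformers checkpoint")
```

If the file is missing, the faster-whisper README describes converting Transformers checkpoints with the ct2-transformers-converter tool; whether that works for this particular checkpoint has not been verified here.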
from realtimestt.