
Comments (11)

vpssa commented on September 17, 2024

OK, it started working. I used the following to resolve it:

    sudo apt-get install libcudnn8

from realtimestt.

KoljaB commented on September 17, 2024

Should only be a warning, not an error. The script should work nevertheless (tested on Firefox 130.0 and Chrome 128.0.6613.120, Windows 11 64-Bit).

If anybody knows how to get rid of that warning, I'm happy to hear ideas or take PRs.
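One idea, assuming the warning in question is the "There is no current event loop" DeprecationWarning: Python 3.10 emits it when asyncio.get_event_loop() is called with no loop running, and wrapping startup in a coroutine passed to asyncio.run() avoids that call. A minimal sketch follows; handle_client, serve and run_for are hypothetical names, and asyncio.start_server stands in for whatever server.py actually starts.

```python
import asyncio


async def handle_client(reader, writer):
    # Hypothetical handler: echo one line back, then close the connection.
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def serve(host="127.0.0.1", port=0, stop=None):
    # Runs the server until the stop event is set.
    if stop is None:
        stop = asyncio.Event()
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await stop.wait()


def run_for(seconds):
    # asyncio.run() creates and closes the event loop itself, so
    # asyncio.get_event_loop() is never called outside a running loop.
    async def main():
        stop = asyncio.Event()
        asyncio.get_running_loop().call_later(seconds, stop.set)
        await serve(stop=stop)
        return "stopped"

    return asyncio.run(main())
```

In a real server the stop event would simply never be set (or be set from a signal handler), which gives the same behaviour as run_forever().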


vpssa commented on September 17, 2024

Couldn't resolve the following error:
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

(/home/azureuser/realtime_translator/venv) azureuser@VoiceCloning:/realtime_translator$ python server.py
Starting server, please wait...
Initializing RealtimeSTT...
RealtimeSTT initialized
Server started. Press Ctrl+C to stop the server.
/home/azureuser/realtime_translator/server.py:109: DeprecationWarning: There is no current event loop
asyncio.get_event_loop().run_until_complete(start_server)
/home/azureuser/realtime_translator/server.py:110: DeprecationWarning: There is no current event loop
asyncio.get_event_loop().run_forever()
Client connected
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Aborted (core dumped)
(/home/azureuser/realtime_translator/venv) azureuser@VoiceCloning:/realtime_translator$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243


vpssa commented on September 17, 2024

> Should only be a warning, not an error. The script should work nevertheless (tested on Firefox 130.0 and Chrome 128.0.6613.120, Windows 11 64-Bit).
>
> If anybody knows how to get rid of that warning, I'm happy to hear ideas or take PRs.

I left the language parameter empty and spoke Hindi; sometimes it translated correctly and sometimes it gave wrong sentences.

Please tell me which parameters I should set for the best real-time STT in any of the supported languages.


KoljaB commented on September 17, 2024

> OK, it started working. I used the following to resolve it: sudo apt-get install libcudnn8

That should be the correct way to solve that. faster_whisper still needs cuDNN 8 on Linux.
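To confirm the fix before starting the server, a quick check is to ask the dynamic loader for the same library named in the error message. This is only a sketch using the standard ctypes module; can_load_shared_library is a hypothetical helper name, and a successful load does not guarantee the cuDNN version matches what faster_whisper expects.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    lib = "libcudnn_ops_infer.so.8"  # the library named in the error above
    status = "loadable" if can_load_shared_library(lib) else "NOT loadable"
    print(lib, status)
```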


KoljaB commented on September 17, 2024

> I left the language parameter empty and spoke Hindi; sometimes it translated correctly and sometimes it gave wrong sentences.
>
> Please tell me which parameters I should set for the best real-time STT in any of the supported languages.

  1. First, change your realtime_model_type. The default "tiny.en" is only good for English. For Hindi you need a larger model. Try medium, or if that does not work well enough, try large or large-v2/large-v3.
  2. Set the language parameter to "hi" for Hindi. Whisper performs much better with a fixed language.
  3. Maybe try large-v3 for the final transcription, since with a fixed language it sometimes delivers better results than large-v2.

So my first take to improve this would be to open server.py and change recorder_config to:

    recorder_config = {
        'spinner': False,
        'use_microphone': False,
        'model': 'large-v3',
        'language': 'hi',
        'silero_sensitivity': 0.4,
        'webrtc_sensitivity': 2,
        'post_speech_silence_duration': 0.7,
        'min_length_of_recording': 0,
        'min_gap_between_recordings': 0,
        'enable_realtime_transcription': True,
        'realtime_processing_pause': 0,
        'realtime_model_type': 'medium',
        'on_realtime_transcription_stabilized': text_detected,
    }
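If the server has to handle more than one language, the same advice can be wrapped in a small helper. This is a sketch only: make_recorder_config is a hypothetical function, the text_detected stub stands in for the callback defined in server.py, and only the keys that depend on the language are shown.

```python
def text_detected(text):
    # Stand-in for the callback that server.py passes to the recorder.
    print(text)


def make_recorder_config(language, callback=text_detected):
    # "tiny.en" is an English-only model, so use it only for English;
    # for other languages fall back to the larger multilingual "medium".
    realtime_model = "tiny.en" if language == "en" else "medium"
    return {
        "model": "large-v3",
        "language": language,  # a fixed language code, e.g. "hi"
        "enable_realtime_transcription": True,
        "realtime_model_type": realtime_model,
        "on_realtime_transcription_stabilized": callback,
    }


config = make_recorder_config("hi")
```

The remaining keys from the config above (sensitivities, silence duration and so on) can be merged in unchanged before the dict is passed to the recorder.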


KoljaB commented on September 17, 2024

You can see all available language codes here btw


vpssa commented on September 17, 2024

It's not giving good output for Hindi; can I also use a model checkpoint from Hugging Face?


KoljaB commented on September 17, 2024

It should be possible to load any huggingface model by specifying user and model name with "username/modelname".

Please also try higher beam sizes for transcription and float16 precision in recorder_config:

        'beam_size': 10,
        'beam_size_realtime': 7,
        'compute_type': 'float16',


vpssa commented on September 17, 2024

I was trying to use "vasista22/whisper-hindi-medium" from Hugging Face, but it gave the following error.
I checked: it works with normal Whisper inference code, but here it fails.
Error:
Starting server, please wait...
Initializing RealtimeSTT...
config.json: 100%|████████████████| 1.29k/1.29k [00:00<00:00, 9.02MB/s]
preprocessor_config.json: 100%|██████| 185k/185k [00:00<00:00, 889kB/s]
RealTimeSTT: root - ERROR - Error initializing faster_whisper realtime transcription model: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 513, in __init__
self.realtime_model_type = faster_whisper.WhisperModel(
File "/home/azureuser/realtime_translator/venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
RuntimeError: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Exception in thread Thread-1 (recorder_thread):
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 946, in run
self._target(*self._args, **self._kwargs)
File "/home/azureuser/realtime_translator/server.py", line 53, in recorder_thread
recorder = AudioToTextRecorder(**recorder_config)
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 513, in __init__
self.realtime_model_type = faster_whisper.WhisperModel(
File "/home/azureuser/realtime_translator/venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
RuntimeError: Unable to open file 'model.bin' in model '/home/azureuser/.cache/huggingface/hub/models--vasista22--whisper-hindi-medium/snapshots/d53532a4dc1d0d89e484ed8f7acfb2228a7d3785'
Exception in thread Thread-2 (_transcription_worker):
Traceback (most recent call last):
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/threading.py", line 946, in run
self._target(*self._args, **self._kwargs)
File "/home/azureuser/realtime_translator/RealtimeSTT/audio_recorder.py", line 775, in _transcription_worker
audio, language = conn.recv()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/home/azureuser/realtime_translator/venv/lib/python3.10/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError


KoljaB commented on September 17, 2024

It may need to be converted to the CTranslate2 format first; I'm not sure. This is out of scope for RealtimeSTT, so I'd discuss it in the faster_whisper repo.
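As a rough way to test that guess: the error complains about a missing model.bin, which is the weights file a CTranslate2 model directory contains. The sketch below checks for that file and, if it is absent, shows one possible conversion through the ctranslate2 Python converter. The function names are hypothetical, the conversion step assumes ctranslate2, transformers and torch are installed, and it has not been verified against this particular checkpoint.

```python
from pathlib import Path


def has_ctranslate2_weights(model_dir):
    """Heuristic: a CTranslate2 model directory contains a model.bin file."""
    return (Path(model_dir) / "model.bin").is_file()


def convert_to_ctranslate2(model_name, output_dir, quantization="float16"):
    # Imported lazily so the check above works without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_name)
    # Writes the converted model (including model.bin) into output_dir.
    return converter.convert(output_dir, quantization=quantization)
```

If the conversion works, the resulting local directory could then be tried in place of the "username/modelname" string; whether RealtimeSTT accepts a local path here is an assumption worth checking.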
