
whispercpp.py's Introduction

Python bindings for whisper.cpp



pip install git+https://github.com/stlukey/whispercpp.py

from whispercpp import Whisper

w = Whisper('tiny')

result = w.transcribe("myfile.mp3")
text = w.extract_text(result)

Note: default parameters might need to be tweaked. See whispercpp.pyx.
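A minimal end-to-end sketch, assuming extract_text returns an iterable of per-segment strings (check whispercpp.pyx if the shape differs); the model file is downloaded to ~/.ggml-models on first use:

from whispercpp import Whisper

w = Whisper('tiny')                   # downloads ggml-tiny.bin to ~/.ggml-models on first use
result = w.transcribe("myfile.mp3")
segments = w.extract_text(result)     # assumed: one string per segment
print("".join(segments))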

whispercpp.py's People

Contributors

aidanperkins, boolemancer, kingjan1999, rlrs, robbiesoutham, stlukey


whispercpp.py's Issues

GitHub Workflow

Workflows need to be created to build the wheel for multiple system types. That way we can expand them to test #3 and #4.

TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

w = Whisper('tiny')
Traceback (most recent call last):
File "", line 1, in
File "whispercpp.pyx", line 89, in whispercpp.Whisper.init
File "whispercpp.pyx", line 34, in whispercpp.download_model
File "whispercpp.pyx", line 31, in whispercpp.model_exists
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 851, in joinpath
return self._make_child(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 616, in _make_child
drv, root, parts = self._parse_args(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

I am facing this issue when trying to initialise the Whisper object with the "tiny" model.
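For what it's worth, the traceback shows a bytes model name being joined onto a pathlib.Path inside whispercpp.pyx (model_exists / download_model), and Path.joinpath only accepts str or os.PathLike. A minimal reproduction and the kind of decode that would avoid it; the fix would belong in whispercpp.pyx rather than in user code, and the ~/.ggml-models location is taken from the logs in other issues here:

from pathlib import Path

models_dir = Path.home() / ".ggml-models"
model = b"ggml-tiny.bin"                 # a bytes model name, as the traceback suggests

# models_dir.joinpath(model)             # raises the TypeError above
path = models_dir.joinpath(model.decode("utf-8"))   # decoding the name first avoids it
print(path)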

Realtime streaming example?

Hey, can you provide an example of how to run the stream example from whisper.cpp with your code in Python? Thanks!
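Not a true streaming port, but a rough chunked workaround is possible under these assumptions: transcribe() accepts a WAV path (as in the README) and sounddevice plus scipy are installed. It records short blocks from the microphone, writes each to a temporary WAV, and transcribes it:

import tempfile

import sounddevice as sd
from scipy.io import wavfile
from whispercpp import Whisper

SAMPLE_RATE = 16000        # whisper.cpp works with 16 kHz mono audio
CHUNK_SECONDS = 5

w = Whisper('tiny')

while True:
    # Record one block of microphone audio (blocking call).
    audio = sd.rec(CHUNK_SECONDS * SAMPLE_RATE, samplerate=SAMPLE_RATE,
                   channels=1, dtype='int16')
    sd.wait()

    # Write it to a temporary WAV file and transcribe that file.
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        wavfile.write(tmp.name, SAMPLE_RATE, audio)
        print(w.extract_text(w.transcribe(tmp.name)))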

Type error, tried everything, nothing working

File "C:\Program Files\Python310\lib\pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

Get progress updates from Python

Hello! I am trying to make a GUI voice transcriber in Python, but I couldn't figure out how to get the progress of recognition.
Also, can I fix the small ggml model's spelling errors? Maybe with some APIs...
Thanks in advance.
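One coarse workaround until a progress callback is exposed: split the input into fixed-length chunks and report progress per chunk. This is only an approximation (it resets context at every chunk boundary) and assumes pydub and ffmpeg are installed:

import tempfile

from pydub import AudioSegment
from whispercpp import Whisper

CHUNK_MS = 30_000                 # 30-second chunks
w = Whisper('tiny')

audio = AudioSegment.from_file("myfile.mp3")
chunks = [audio[i:i + CHUNK_MS] for i in range(0, len(audio), CHUNK_MS)]

pieces = []
for n, chunk in enumerate(chunks, start=1):
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        chunk.export(tmp.name, format="wav")
        pieces.append(w.extract_text(w.transcribe(tmp.name)))
    print(f"progress: {n}/{len(chunks)} chunks ({100 * n // len(chunks)}%)")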

All models cause Python to core dump

If I load the medium.en model with whisper.cpp, it runs fine. But I get a core dump trying to do the same thing with this wrapper:

>>> w = Whisper('medium')
Downloading ggml-medium.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 4
whisper_model_load: mem required  = 1725.00 MB (+   43.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 1462.35 MB
Illegal instruction (core dumped)

Am I doing something wrong?
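A guess rather than a confirmed cause: the extension is compiled with -mavx -mavx2 -mfma -mf16c (see the arm64 issue below), so "Illegal instruction" often means the CPU is missing one of those extensions. A quick Linux-only check:

# Compare the CPU's flags against the instruction sets the wheel is built with.
cpu_flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            cpu_flags = set(line.split(":", 1)[1].split())
            break

for isa in ("avx", "avx2", "fma", "f16c"):
    print(isa, "ok" if isa in cpu_flags else "MISSING")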

Crashes on Windows

Seems to crash (the program and the Python interpreter exit) on Windows during transcription (when calling whisper_full), without any other output or error.

Set params?

How do you access and set the various parameters passed to Whisper? I see the params structure in the Whisper class, but it appears to be private and not externally accessible.

>>> from whispercpp import Whisper
>>> w = Whisper('large')
>>> w.params.processors = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'whispercpp.Whisper' object has no attribute 'params'

Some compilation options are for amd64 only, doesn't work on arm64

On a tiny Hetzner ARM instance, a simple pip install . raises an error:

Building wheels for collected packages: whispercpp
  Building wheel for whispercpp (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for whispercpp (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [5 lines of output]
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx2’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mfma’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mf16c’
      error: command '/usr/bin/aarch64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for whispercpp
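The usual fix is to add the x86-only SIMD flags conditionally in the build script. A sketch of the idea only; the actual flag list and setup.py structure in this repo may differ:

import platform

# Only pass x86 SIMD flags when actually targeting x86_64; on aarch64 these
# are exactly the options gcc rejects above.
extra_compile_args = ["-O3"]
if platform.machine() in ("x86_64", "AMD64"):
    extra_compile_args += ["-mavx", "-mavx2", "-mfma", "-mf16c"]
print(extra_compile_args)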

Missing Licence

Hi @stlukey ,

Thanks for your work! Is there interest from your side in adding a license? Without a license it is practically not usable by anybody.

Building wheels on WSL fails

Hi,

When I try to install the package (poetry install) in a Poetry project with the following pyproject.toml:

[tool.poetry]
name = "whispercppy-test"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"
packages = [{include = "whispercppy_test"}]

[tool.poetry.dependencies]
python = "^3.10"
whispercpp = {git = "https://github.com/o4dev/whispercpp.py"}


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

I get the following error when building the whispercpp.py wheels:

pux@Hannes-PC:/mnt/c/Programs/whispercppy-test$ poetry install
Installing dependencies from lock file
Warning: poetry.lock is not consistent with pyproject.toml. You may be getting improper dependencies. Run `poetry lock [--no-update]` to fix it.

Package operations: 1 install, 0 updates, 0 removals

  • Installing whispercpp (1.0 275783f): Failed

  CalledProcessError

  Command '['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py']' returned non-zero exit status 1.

  at /usr/lib/python3.10/subprocess.py:524 in run
       520│             # We don't call process.wait() as .__exit__ does that for us.
       521│             raise
       522│         retcode = process.poll()
       523│         if check and retcode:
    →  524│             raise CalledProcessError(retcode, process.args,
       525│                                      output=stdout, stderr=stderr)
       526│     return CompletedProcess(process.args, retcode, stdout, stderr)
       527│ 
       528│ 

The following error occurred when trying to handle this error:


  EnvCommandError

  Command ['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py'] errored with the following return code 1, and output: 
  Processing /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py
    Installing build dependencies: started
    Installing build dependencies: finished with status 'done'
    Getting requirements to build wheel: started
    Getting requirements to build wheel: finished with status 'done'
    Preparing metadata (pyproject.toml): started
    Preparing metadata (pyproject.toml): finished with status 'done'
  Building wheels for collected packages: whispercpp
    Building wheel for whispercpp (pyproject.toml): started
    Building wheel for whispercpp (pyproject.toml): finished with status 'error'
    error: subprocess-exited-with-error
    
    × Building wheel for whispercpp (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [13 lines of output]
        ./whisper.cpp/ggml.c: In function ‘ggml_time_ms’:
        ./whisper.cpp/ggml.c:269:5: warning: implicit declaration of function ‘clock_gettime’ [-Wimplicit-function-declaration]
          269 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |     ^~~~~~~~~~~~~
        ./whisper.cpp/ggml.c:269:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
          269 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |                   ^~~~~~~~~~~~~~~
        ./whisper.cpp/ggml.c:269:19: note: each undeclared identifier is reported only once for each function it appears in
        ./whisper.cpp/ggml.c: In function ‘ggml_time_us’:
        ./whisper.cpp/ggml.c:275:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
          275 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |                   ^~~~~~~~~~~~~~~
        error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for whispercpp
  Failed to build whispercpp
  ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
  

  at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/env.py:1540 in _run
      1536│                 output = subprocess.check_output(
      1537│                     command, stderr=subprocess.STDOUT, env=env, **kwargs
      1538│                 )
      1539│         except CalledProcessError as e:
    → 1540│             raise EnvCommandError(e, input=input_)
      1541│ 
      1542│         return decode(output)
      1543│ 
      1544│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:

The following error occurred when trying to handle this error:


  PoetryException

  Failed to install /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py

  at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/pip.py:58 in pip_install
       54│ 
       55│     try:
       56│         return environment.run_pip(*args)
       57│     except EnvCommandError as e:
    →  58│         raise PoetryException(f"Failed to install {path.as_posix()}") from e
       59│ 

(I just tested it with pip; I get the same error there.)

PS: Thanks a lot for all the work you put into this valuable project!

Quantization model support?

Thanks for writing this. Seems to work well.

However, is there a way to use quantized models with this? I see they aren't included in your MODELS dictionary.

If I hot-patch it like:

from whispercpp import Whisper, MODELS
MODELS['ggml-large-q5_0.bin'] = 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-q5_0.bin'
w = Whisper('large-q5_0')

that makes it download the model, but when it tries to load the model, it fails with:

Downloading ggml-large-q5_0.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-large-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1008
whisper_model_load: type          = 5
whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 2950.97 MB
whisper_model_load: unknown tensor '<garbled binary tensor name>' in model file
whisper_init_no_state: failed to load model

Support specifying the language to decode

Thank you very much for your work. I want to use this package for speech-to-text to transcribe Spanish. When I change the language in both files to 'es', it seems to understand Spanish, but it transcribes in English. For example, I say "Cómo va la vida?" and it transcribes "How is life?", which is the correct English translation of the phrase (so it understands what I say). Any idea how to change this? I am sure that I want to transcribe in Spanish, so I don't need automatic language detection or anything like that. Do you know how to tweak it so it does just that?
Thank you very much.
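The language is not exposed through the Python API today, so the usual workaround is to edit the defaults in whispercpp.pyx and reinstall. English output from Spanish speech is what whisper.cpp's translate flag produces, so turning it off is likely the key change. The field names below come from whisper.cpp's whisper_full_params struct; exactly where they are set in whispercpp.pyx is an assumption, so look for the call to whisper_full_default_params:

# Inside whispercpp.pyx, roughly where the whisper_full_params are filled in:
params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
params.translate = False    # transcribe in the spoken language, don't translate to English
params.language = b'es'     # ISO 639-1 hint; a C string (bytes) in the struct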

Pip install failing on Windows

When I install it through the pip command, it always gives me this error: error: subprocess-exited-with-error

× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects

I have also installed Microsoft Visual C++ 14, but it still does not work.

Crashing on Linux for me

Transcribing..
[New Thread 0x7f040122d700 (LWP 226)]
[New Thread 0x7f0401a2e700 (LWP 227)]

Thread 33 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f040122d700 (LWP 226)]
0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
    at ./whisper.cpp/whisper.cpp:2089
2089    ./whisper.cpp/whisper.cpp: Is a directory.
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-19.amzn2.x86_64 libffi-3.0.13-18.amzn2.0.2.x86_64 libgcc-7.3.1-13.amzn2.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 libstdc++-7.3.1-13.amzn2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.8.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) where
#0  0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
    at ./whisper.cpp/whisper.cpp:2089
#1  std::__invoke_impl<void, log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__f=...) at /usr/include/c++/7/bits/invoke.h:60
#2  std::__invoke<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__fn=...) at /usr/include/c++/7/bits/invoke.h:95
#3  std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::_M_invoke<0, 1> (this=0x1e3b428) at /usr/include/c++/7/thread:234
#4  std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::operator() (this=0x1e3b428) at /usr/include/c++/7/thread:243
#5  std::thread::_State_impl<std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> > >::_M_run(void) (this=0x1e3b420)
    at /usr/include/c++/7/thread:186
#6  0x00007f044d6c2acf in ?? () from /lib64/libstdc++.so.6
#7  0x00007f045531440b in start_thread () from /lib64/libpthread.

pip install failed on Mac M2

When I run the command pip install git+https://github.com/stlukey/whispercpp.py, I get:

command '/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_810lo85vyi/croot/python-split_1678271120546/_build_env/bin/llvm-ar' failed: No such file or directory
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects

I referred to this link: https://tipseason.com/carbon-language-execvp-error/, but it did not work.
