
whispercpp.py's Introduction

Python bindings for whisper.cpp



pip install git+https://github.com/stlukey/whispercpp.py

from whispercpp import Whisper

w = Whisper('tiny')

result = w.transcribe("myfile.mp3")
text = w.extract_text(result)

Note: default parameters might need to be tweaked. See whispercpp.pyx.
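A minimal end-to-end sketch, assuming extract_text returns an iterable of per-segment strings (check whispercpp.pyx if the shape differs); the model file is downloaded to ~/.ggml-models on first use:

from whispercpp import Whisper

w = Whisper('tiny')                   # downloads ggml-tiny.bin to ~/.ggml-models on first use
result = w.transcribe("myfile.mp3")
segments = w.extract_text(result)     # assumed: one string per segment
print("".join(segments))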

whispercpp.py's People

Contributors

aidanperkins, boolemancer, kingjan1999, rlrs, robbiesoutham, stlukey


whispercpp.py's Issues

GitHub Workflow

Workflows need to be created to build the wheel for multiple system types. That way we can expand them to test #3 and #4.

TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

w = Whisper('tiny')
Traceback (most recent call last):
File "", line 1, in
File "whispercpp.pyx", line 89, in whispercpp.Whisper.init
File "whispercpp.pyx", line 34, in whispercpp.download_model
File "whispercpp.pyx", line 31, in whispercpp.model_exists
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 851, in joinpath
return self._make_child(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 616, in _make_child
drv, root, parts = self._parse_args(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

I am facing this issue when trying to initialise the Whisper object with the "tiny" model.
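For what it's worth, the traceback shows a bytes model name being joined onto a pathlib.Path inside whispercpp.pyx (model_exists / download_model), and Path.joinpath only accepts str or os.PathLike. A minimal reproduction and the kind of decode that would avoid it; the fix would belong in whispercpp.pyx rather than in user code, and the ~/.ggml-models location is taken from the logs in other issues here:

from pathlib import Path

models_dir = Path.home() / ".ggml-models"
model = b"ggml-tiny.bin"                 # a bytes model name, as the traceback suggests

# models_dir.joinpath(model)             # raises the TypeError above
path = models_dir.joinpath(model.decode("utf-8"))   # decoding the name first avoids it
print(path)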

Realtime streaming example?

Hey, can you provide an example of how to run the stream example from whisper.cpp with your code in Python? Thanks!
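Not a true streaming port, but a rough chunked workaround is possible under these assumptions: transcribe() accepts a WAV path (as in the README) and sounddevice plus scipy are installed. It records short blocks from the microphone, writes each to a temporary WAV, and transcribes it:

import tempfile

import sounddevice as sd
from scipy.io import wavfile
from whispercpp import Whisper

SAMPLE_RATE = 16000        # whisper.cpp works with 16 kHz mono audio
CHUNK_SECONDS = 5

w = Whisper('tiny')

while True:
    # Record one block of microphone audio (blocking call).
    audio = sd.rec(CHUNK_SECONDS * SAMPLE_RATE, samplerate=SAMPLE_RATE,
                   channels=1, dtype='int16')
    sd.wait()

    # Write it to a temporary WAV file and transcribe that file.
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        wavfile.write(tmp.name, SAMPLE_RATE, audio)
        print(w.extract_text(w.transcribe(tmp.name)))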

Type error, tried everything, nothing working

File "C:\Program Files\Python310\lib\pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>

Get progress updates from Python

Hello! I am trying to make a GUI voice transcriber in Python, but I couldn't figure out how to get the progress of recognition.
Also, can I fix the small ggml model's spelling errors? Maybe with some APIs...
Thanks in advance.
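One coarse workaround until a progress callback is exposed: split the input into fixed-length chunks and report progress per chunk. This is only an approximation (it resets context at every chunk boundary) and assumes pydub and ffmpeg are installed:

import tempfile

from pydub import AudioSegment
from whispercpp import Whisper

CHUNK_MS = 30_000                 # 30-second chunks
w = Whisper('tiny')

audio = AudioSegment.from_file("myfile.mp3")
chunks = [audio[i:i + CHUNK_MS] for i in range(0, len(audio), CHUNK_MS)]

pieces = []
for n, chunk in enumerate(chunks, start=1):
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        chunk.export(tmp.name, format="wav")
        pieces.append(w.extract_text(w.transcribe(tmp.name)))
    print(f"progress: {n}/{len(chunks)} chunks ({100 * n // len(chunks)}%)")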

All models cause Python to core dump

If I load the medium.en model with whisper.cpp, it runs fine. But I get a core dump trying to do the same thing with this wrapper:

>>> w = Whisper('medium')
Downloading ggml-medium.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 4
whisper_model_load: mem required  = 1725.00 MB (+   43.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 1462.35 MB
Illegal instruction (core dumped)

Am I doing something wrong?
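A guess rather than a confirmed cause: the extension is compiled with -mavx -mavx2 -mfma -mf16c (see the arm64 issue below), so "Illegal instruction" often means the CPU is missing one of those extensions. A quick Linux-only check:

# Compare the CPU's flags against the instruction sets the wheel is built with.
cpu_flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            cpu_flags = set(line.split(":", 1)[1].split())
            break

for isa in ("avx", "avx2", "fma", "f16c"):
    print(isa, "ok" if isa in cpu_flags else "MISSING")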

Crashes on Windows

Seems to crash (the program and the Python interpreter exit) on Windows during transcription (when calling whisper_full), without any other output or error.

Set params?

How do you access and set the various parameters passed to Whisper? I see the params structure in the Whisper class, but it appears to be private and not externally accessible.

>>> from whispercpp import Whisper
>>> w = Whisper('large')
>>> w.params.processors = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'whispercpp.Whisper' object has no attribute 'params'

Some compilation options are for amd64 only, doesn't work on arm64

On a tiny Hetzner ARM instance, a simple pip install . raises an error:

Building wheels for collected packages: whispercpp
  Building wheel for whispercpp (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for whispercpp (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [5 lines of output]
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx2’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mfma’
      aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mf16c’
      error: command '/usr/bin/aarch64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for whispercpp
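The usual fix is to add the x86-only SIMD flags conditionally in the build script. A sketch of the idea only; the actual flag list and setup.py structure in this repo may differ:

import platform

# Only pass x86 SIMD flags when actually targeting x86_64; on aarch64 these
# are exactly the options gcc rejects above.
extra_compile_args = ["-O3"]
if platform.machine() in ("x86_64", "AMD64"):
    extra_compile_args += ["-mavx", "-mavx2", "-mfma", "-mf16c"]
print(extra_compile_args)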

Missing Licence

Hi @stlukey ,

Thanks for your work! Is there interest from your side in adding a license? Without a license it is practically not usable by anybody.

Building wheels on WSL fails

Hi,

When I try to install the package (poetry install) in a Poetry project with the following pyproject.toml:

[tool.poetry]
name = "whispercppy-test"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"
packages = [{include = "whispercppy_test"}]

[tool.poetry.dependencies]
python = "^3.10"
whispercpp = {git = "https://github.com/o4dev/whispercpp.py"}


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

I get the following error when building the whispercpp.py wheels:

pux@Hannes-PC:/mnt/c/Programs/whispercppy-test$ poetry install
Installing dependencies from lock file
Warning: poetry.lock is not consistent with pyproject.toml. You may be getting improper dependencies. Run `poetry lock [--no-update]` to fix it.

Package operations: 1 install, 0 updates, 0 removals

  • Installing whispercpp (1.0 275783f): Failed

  CalledProcessError

  Command '['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py']' returned non-zero exit status 1.

  at /usr/lib/python3.10/subprocess.py:524 in run
       520│             # We don't call process.wait() as .__exit__ does that for us.
       521│             raise
       522│         retcode = process.poll()
       523│         if check and retcode:
    →  524│             raise CalledProcessError(retcode, process.args,
       525│                                      output=stdout, stderr=stderr)
       526│     return CompletedProcess(process.args, retcode, stdout, stderr)
       527│ 
       528│ 

The following error occurred when trying to handle this error:


  EnvCommandError

  Command ['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py'] errored with the following return code 1, and output: 
  Processing /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py
    Installing build dependencies: started
    Installing build dependencies: finished with status 'done'
    Getting requirements to build wheel: started
    Getting requirements to build wheel: finished with status 'done'
    Preparing metadata (pyproject.toml): started
    Preparing metadata (pyproject.toml): finished with status 'done'
  Building wheels for collected packages: whispercpp
    Building wheel for whispercpp (pyproject.toml): started
    Building wheel for whispercpp (pyproject.toml): finished with status 'error'
    error: subprocess-exited-with-error
    
    × Building wheel for whispercpp (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [13 lines of output]
        ./whisper.cpp/ggml.c: In function ‘ggml_time_ms’:
        ./whisper.cpp/ggml.c:269:5: warning: implicit declaration of function ‘clock_gettime’ [-Wimplicit-function-declaration]
          269 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |     ^~~~~~~~~~~~~
        ./whisper.cpp/ggml.c:269:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
          269 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |                   ^~~~~~~~~~~~~~~
        ./whisper.cpp/ggml.c:269:19: note: each undeclared identifier is reported only once for each function it appears in
        ./whisper.cpp/ggml.c: In function ‘ggml_time_us’:
        ./whisper.cpp/ggml.c:275:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
          275 |     clock_gettime(CLOCK_MONOTONIC, &ts);
              |                   ^~~~~~~~~~~~~~~
        error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for whispercpp
  Failed to build whispercpp
  ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
  

  at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/env.py:1540 in _run
      1536│                 output = subprocess.check_output(
      1537│                     command, stderr=subprocess.STDOUT, env=env, **kwargs
      1538│                 )
      1539│         except CalledProcessError as e:
    → 1540│             raise EnvCommandError(e, input=input_)
      1541│ 
      1542│         return decode(output)
      1543│ 
      1544│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:

The following error occurred when trying to handle this error:


  PoetryException

  Failed to install /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py

  at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/pip.py:58 in pip_install
       54│ 
       55│     try:
       56│         return environment.run_pip(*args)
       57│     except EnvCommandError as e:
    →  58│         raise PoetryException(f"Failed to install {path.as_posix()}") from e
       59│ 

(I just tested it with pip; I get the same error there.)

PS: Thanks a lot for all the work you put into this valuable project!

Quantization model support?

Thanks for writing this. Seems to work well.

However, is there a way to use quantized models with this? I see they aren't included in your MODELS dictionary.

If I hot-patch it like:

from whispercpp import Whisper, MODELS
MODELS['ggml-large-q5_0.bin'] = 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-q5_0.bin'
w = Whisper('large-q5_0')

that makes it download the model, but when it tries to load the model, it fails with:

Downloading ggml-large-q5_0.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-large-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1008
whisper_model_load: type          = 5
whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 2950.97 MB
whisper_model_load: unknown tensor '<garbled binary tensor name>' in model file
whisper_init_no_state: failed to load model

Support specifying the language to decode

Thank you very much for your work. I want to use this package for speech-to-text to transcribe Spanish. When I change the language in both files to 'es', it seems to understand Spanish, but it transcribes in English. For example, I say "Cómo va la vida?" and it transcribes "How is life?", which is the correct English translation of the phrase (so it understands what I say). Any idea how to change this? I am sure that I want to transcribe in Spanish, so I don't need automatic language detection or anything like that. Do you know how to tweak it so it does just that?
Thank you very much.
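The language is not exposed through the Python API today, so the usual workaround is to edit the defaults in whispercpp.pyx and reinstall. English output from Spanish speech is what whisper.cpp's translate flag produces, so turning it off is likely the key change. The field names below come from whisper.cpp's whisper_full_params struct; exactly where they are set in whispercpp.pyx is an assumption, so look for the call to whisper_full_default_params:

# Inside whispercpp.pyx, roughly where the whisper_full_params are filled in:
params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
params.translate = False    # transcribe in the spoken language, don't translate to English
params.language = b'es'     # ISO 639-1 hint; a C string (bytes) in the struct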

Pip install failing on Windows

When I install it through the pip command, it always gives me this error: error: subprocess-exited-with-error

× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects

I have also installed Microsoft Visual C++ 14, but it still does not work.

Crashing on Linux for me

Transcribing..
[New Thread 0x7f040122d700 (LWP 226)]
[New Thread 0x7f0401a2e700 (LWP 227)]

Thread 33 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f040122d700 (LWP 226)]
0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
    at ./whisper.cpp/whisper.cpp:2089
2089    ./whisper.cpp/whisper.cpp: Is a directory.
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-19.amzn2.x86_64 libffi-3.0.13-18.amzn2.0.2.x86_64 libgcc-7.3.1-13.amzn2.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 libstdc++-7.3.1-13.amzn2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.8.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) where
#0  0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
    at ./whisper.cpp/whisper.cpp:2089
#1  std::__invoke_impl<void, log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__f=...) at /usr/include/c++/7/bits/invoke.h:60
#2  std::__invoke<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__fn=...) at /usr/include/c++/7/bits/invoke.h:95
#3  std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::_M_invoke<0, 1> (this=0x1e3b428) at /usr/include/c++/7/thread:234
#4  std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::operator() (this=0x1e3b428) at /usr/include/c++/7/thread:243
#5  std::thread::_State_impl<std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> > >::_M_run(void) (this=0x1e3b420)
    at /usr/include/c++/7/thread:186
#6  0x00007f044d6c2acf in ?? () from /lib64/libstdc++.so.6
#7  0x00007f045531440b in start_thread () from /lib64/libpthread.

pip install failed on Mac M2

When I run the command pip install git+https://github.com/stlukey/whispercpp.py, I get:

command '/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_810lo85vyi/croot/python-split_1678271120546/_build_env/bin/llvm-ar' failed: No such file or directory
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects

I referred to this link: https://tipseason.com/carbon-language-execvp-error/, but it did not work.
