pip install git+https://github.com/stlukey/whispercpp.py
from whispercpp import Whisper
w = Whisper('tiny')
result = w.transcribe("myfile.mp3")
text = w.extract_text(result)
Note: the default parameters might need to be tweaked; see whispercpp.pyx.
Python bindings for whisper.cpp
License: MIT License
All the fillers like "um", "uh", "mm", etc. are filtered out when transcribing the file.
Is there a setting I can change so that this does not happen?
w = Whisper('tiny')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "whispercpp.pyx", line 89, in whispercpp.Whisper.__init__
File "whispercpp.pyx", line 34, in whispercpp.download_model
File "whispercpp.pyx", line 31, in whispercpp.model_exists
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 851, in joinpath
return self._make_child(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 616, in _make_child
drv, root, parts = self._parse_args(args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>
I am facing this issue when trying to initialise a Whisper object with the "tiny" model.
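The traceback points at a bytes object being passed into pathlib somewhere inside the binding. A minimal stdlib-only reproduction of the same error and the decode that avoids it (the model name here is illustrative):

```python
from pathlib import Path

model_name = b"ggml-tiny.bin"  # bytes, as the binding appears to pass internally

try:
    Path.home().joinpath(model_name)
except TypeError:
    pass  # pathlib rejects bytes path components with exactly this TypeError

# Decoding to str first avoids the error:
p = Path.home().joinpath(model_name.decode("utf-8"))
```

This suggests the fix in the binding is to decode the model filename before joining it onto the models directory.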
Hi there,
There is no documentation, but I am curious whether it is possible to set the language of the transcription.
The models need to be downloaded from Hugging Face instead.
How can we add CoreML support? Thx!
Hey, can you provide an example of how to run the stream example provided in whisper.cpp with your code in Python? Thanks!
File "C:\Program Files\Python310\lib\pathlib.py", line 583, in _parse_args
raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>
Hello! I'm trying to make a GUI voice transcriber in Python, but I couldn't figure out how to get the progress of the recognition.
Also, can I fix the small ggml model's spelling errors, maybe with some APIs?
Thanks in advance.
If I load the medium.en model with whisper.cpp, it runs fine. But I get a core dump trying to do the same thing with this wrapper:
>>> w = Whisper('medium')
Downloading ggml-medium.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 4
whisper_model_load: mem required = 1725.00 MB (+ 43.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 1462.35 MB
Illegal instruction (core dumped)
Am I doing something wrong?
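"Illegal instruction" on load is often not a model problem but a CPU one: the extension is compiled with the x86 SIMD flags this project passes (-mavx, -mavx2, -mfma, -mf16c), and running it on a CPU without those instruction sets crashes exactly this way. A stdlib-only sketch for checking what the CPU advertises on Linux (the path and parsing are specific to /proc/cpuinfo; adjust on other platforms):

```python
def cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags advertised by the kernel (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                # "flags : fpu vme ... avx avx2 fma" -> set of flag names
                return set(line.split(":", 1)[1].split())
    return set()

# e.g. which instruction sets the build assumes but the CPU lacks:
# missing = {"avx", "avx2", "fma", "f16c"} - cpu_flags()
```

If anything comes back missing, rebuilding without the corresponding -m flags would be worth trying.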
Seems to crash (ends the program and the Python interpreter) on Windows during transcription (when calling whisper_full) without any other output or error.
How do you access and set the various parameters passed to Whisper? I see the params structure in the Whisper class, but it appears to be private and not externally accessible.
>>> from whispercpp import Whisper
>>> w = Whisper('large')
>>> w.params.processors = 4
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'whispercpp.Whisper' object has no attribute 'params'
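Until the binding exposes this, there is no public way to change the parameters without editing whispercpp.pyx. As a feature-request sketch, an exposed-parameters API could look like this plain-Python analogue (the class, attribute names, and defaults are hypothetical, not the binding's actual interface):

```python
class _Binding:
    """Illustrative stand-in for a Cython wrapper with private parameters."""

    def __init__(self):
        # hypothetical defaults, named after whisper.cpp-style parameters
        self._params = {"processors": 1, "language": "en"}

    @property
    def params(self):
        # Return a snapshot so callers cannot mutate internals directly.
        return dict(self._params)

    def set_param(self, key, value):
        if key not in self._params:
            raise KeyError(key)
        self._params[key] = value

b = _Binding()
b.set_param("processors", 4)
```

A read-only snapshot plus an explicit setter keeps validation in one place, which matters when the values are forwarded to C structs.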
On a tiny Hetzner ARM instance, a simple pip install . raises an error:
Building wheels for collected packages: whispercpp
Building wheel for whispercpp (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [5 lines of output]
aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx’
aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mavx2’
aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mfma’
aarch64-linux-gnu-gcc: error: unrecognized command-line option ‘-mf16c’
error: command '/usr/bin/aarch64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
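The build hard-codes x86 SIMD flags that aarch64 gcc rejects. A possible workaround is gating them on the host architecture in setup.py; a sketch of the idea (not the project's actual build configuration):

```python
import platform

# Flags from the failing build log; valid only on x86 compilers.
X86_FLAGS = ["-mavx", "-mavx2", "-mfma", "-mf16c"]

def simd_flags(machine=None):
    """Return SIMD compile flags appropriate for the host architecture."""
    machine = machine or platform.machine()
    # aarch64-linux-gnu-gcc rejects -mavx etc., so pass nothing there.
    return X86_FLAGS if machine in ("x86_64", "AMD64", "i686") else []
```

On ARM one could additionally pass NEON-related flags, but an empty list is enough to get the wheel building.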
Hi @stlukey ,
Thanks for your work! Is there interest on your side in adding a license? Without a license, it's practically unusable for anybody.
Hi,
when I try to install the package (poetry install) in a Poetry project with the following pyproject.toml:
[tool.poetry]
name = "whispercppy-test"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"
packages = [{include = "whispercppy_test"}]
[tool.poetry.dependencies]
python = "^3.10"
whispercpp = {git = "https://github.com/o4dev/whispercpp.py"}
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
I get the following error when building the whispercpp.py wheels:
pux@Hannes-PC:/mnt/c/Programs/whispercppy-test$ poetry install
Installing dependencies from lock file
Warning: poetry.lock is not consistent with pyproject.toml. You may be getting improper dependencies. Run `poetry lock [--no-update]` to fix it.
Package operations: 1 install, 0 updates, 0 removals
• Installing whispercpp (1.0 275783f): Failed
CalledProcessError
Command '['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py']' returned non-zero exit status 1.
at /usr/lib/python3.10/subprocess.py:524 in run
520│ # We don't call process.wait() as .__exit__ does that for us.
521│ raise
522│ retcode = process.poll()
523│ if check and retcode:
→ 524│ raise CalledProcessError(retcode, process.args,
525│ output=stdout, stderr=stderr)
526│ return CompletedProcess(process.args, retcode, stdout, stderr)
527│
528│
The following error occurred when trying to handle this error:
EnvCommandError
Command ['/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10', '--upgrade', '--no-deps', '/home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py'] errored with the following return code 1, and output:
Processing /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Building wheels for collected packages: whispercpp
Building wheel for whispercpp (pyproject.toml): started
Building wheel for whispercpp (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [13 lines of output]
./whisper.cpp/ggml.c: In function ‘ggml_time_ms’:
./whisper.cpp/ggml.c:269:5: warning: implicit declaration of function ‘clock_gettime’ [-Wimplicit-function-declaration]
269 | clock_gettime(CLOCK_MONOTONIC, &ts);
| ^~~~~~~~~~~~~
./whisper.cpp/ggml.c:269:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
269 | clock_gettime(CLOCK_MONOTONIC, &ts);
| ^~~~~~~~~~~~~~~
./whisper.cpp/ggml.c:269:19: note: each undeclared identifier is reported only once for each function it appears in
./whisper.cpp/ggml.c: In function ‘ggml_time_us’:
./whisper.cpp/ggml.c:275:19: error: ‘CLOCK_MONOTONIC’ undeclared (first use in this function)
275 | clock_gettime(CLOCK_MONOTONIC, &ts);
| ^~~~~~~~~~~~~~~
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/env.py:1540 in _run
1536│ output = subprocess.check_output(
1537│ command, stderr=subprocess.STDOUT, env=env, **kwargs
1538│ )
1539│ except CalledProcessError as e:
→ 1540│ raise EnvCommandError(e, input=input_)
1541│
1542│ return decode(output)
1543│
1544│ def execute(self, bin: str, *args: str, **kwargs: Any) -> int:
The following error occurred when trying to handle this error:
PoetryException
Failed to install /home/pux/.cache/pypoetry/virtualenvs/whispercppy-test-yr7ZLSr5-py3.10/src/whispercpp.py
at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/utils/pip.py:58 in pip_install
54│
55│ try:
56│ return environment.run_pip(*args)
57│ except EnvCommandError as e:
→ 58│ raise PoetryException(f"Failed to install {path.as_posix()}") from e
59│
(I also tested it with pip and get the same error there.)
PS: Thanks a lot for all the work you put in this valuable project!
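The CLOCK_MONOTONIC failure above usually means ggml.c was compiled under a strict C standard, which hides clock_gettime() behind a feature-test macro in glibc. A sketch of what a patched setup.py could pass (an assumption about the fix, not the project's actual build configuration):

```python
# _POSIX_C_SOURCE >= 199309L exposes clock_gettime()/CLOCK_MONOTONIC in glibc
define_macros = [("_POSIX_C_SOURCE", "199309L")]

# Equivalent command-line form for the compiler invocation:
extra_compile_args = [f"-D{name}={value}" for name, value in define_macros]
```

Defining _GNU_SOURCE instead would also work, at the cost of pulling in more than just the POSIX timer declarations.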
Thanks for writing this. Seems to work well.
However, is there a way to use quantized models with this? I see they aren't included in your MODELS dictionary.
If I hot-patch it like:
from whispercpp import Whisper, MODELS
MODELS['ggml-large-q5_0.bin'] = 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-q5_0.bin'
w = Whisper('large-q5_0')
that makes it download the model, but when it tries to load the model, it fails with:
Downloading ggml-large-q5_0.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-large-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1280
whisper_model_load: n_text_head = 20
whisper_model_load: n_text_layer = 32
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1008
whisper_model_load: type = 5
whisper_model_load: mem required = 3342.00 MB (+ 71.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 2950.97 MB
whisper_model_load: unknown tensor '<binary garbage>' in model file
whisper_init_no_state: failed to load model
Is it possible to change the default language? How?
Thank you very much for your work. I want to use this package as speech-to-text to transcribe Spanish. When I change the language in both files to 'es', it seems to understand Spanish, but it transcribes in English. For example, I say "¿Cómo va la vida?" and it transcribes "How is life?", which is the correct English translation of the phrase (so it understands what I say). Any idea how to change this? I am sure that I want to transcribe in Spanish, so I don't need automatic detection of the language or anything like that. Do you know how to tweak it so it does just that?
Thank you very much.
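Output coming back in English despite language='es' matches the behaviour of whisper.cpp's translate flag: when whisper_full_params.translate is true, whisper renders its output in English regardless of the input language. The parameter pair to check in whispercpp.pyx, shown here as an illustrative dict since the binding does not expose its params publicly:

```python
# Field names come from whisper.cpp's whisper_full_params struct;
# this dict is only an illustration of the values to set in the binding.
params = {
    "language": "es",    # language of the transcription output
    "translate": False,  # True would translate everything to English
}
```

Setting the language alone is not enough; translate must also be false for same-language transcription.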
When I install it through the pip command, it always gives me this error: `error: subprocess-exited-with-error
× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects`
I have already installed Microsoft Visual C++ 14 as well, but it still does not work.
The large model doesn't work:
>>> w = Whisper('large')
Downloading ggml-large.bin...
whisper_init_from_file_no_state: loading model from '/Users/mlecarme/.ggml-models/ggml-large.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_no_state: failed to load model
You have to pick v1, v2 or v3.
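The "bad magic" error is because the upstream model repo now ships versioned large checkpoints instead of a single ggml-large.bin. Following the MODELS hot-patch pattern shown earlier in this thread, a versioned entry might look like this (the URL is assumed from the ggerganov/whisper.cpp Hugging Face repo and should be verified before use):

```python
# Hypothetical MODELS entry for a versioned large checkpoint; the real
# dictionary lives in whispercpp and would be patched the same way.
MODELS = {
    "ggml-large-v2.bin":
        "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v2.bin",
}
```

Whether the pinned whisper.cpp submodule can actually load a given versioned checkpoint still depends on how old the pin is.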
Transcribing..
[New Thread 0x7f040122d700 (LWP 226)]
[New Thread 0x7f0401a2e700 (LWP 227)]
Thread 33 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f040122d700 (LWP 226)]
0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
at ./whisper.cpp/whisper.cpp:2089
2089 ./whisper.cpp/whisper.cpp: Is a directory.
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-19.amzn2.x86_64 libffi-3.0.13-18.amzn2.0.2.x86_64 libgcc-7.3.1-13.amzn2.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 libstdc++-7.3.1-13.amzn2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.8.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) where
#0 0x00007f044d9c8f41 in <lambda(int)>::operator() (ith=<optimized out>, __closure=0x1e3b430)
at ./whisper.cpp/whisper.cpp:2089
#1 std::__invoke_impl<void, log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__f=...) at /usr/include/c++/7/bits/invoke.h:60
#2 std::__invoke<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> (__fn=...) at /usr/include/c++/7/bits/invoke.h:95
#3 std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::_M_invoke<0, 1> (this=0x1e3b428) at /usr/include/c++/7/thread:234
#4 std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> >::operator() (this=0x1e3b428) at /usr/include/c++/7/thread:243
#5 std::thread::_State_impl<std::thread::_Invoker<std::tuple<log_mel_spectrogram(float const*, int, int, int, int, int, int, const whisper_filters&, bool, whisper_mel&)::<lambda(int)>, int> > >::_M_run(void) (this=0x1e3b420)
at /usr/include/c++/7/thread:186
#6 0x00007f044d6c2acf in ?? () from /lib64/libstdc++.so.6
#7 0x00007f045531440b in start_thread () from /lib64/libpthread.
When running the command pip install git+https://github.com/stlukey/whispercpp.py, I get:
command '/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_810lo85vyi/croot/python-split_1678271120546/_build_env/bin/llvm-ar' failed: No such file or directory
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
I referred to this link: https://tipseason.com/carbon-language-execvp-error/, but it did not work.