
Vosk API - Docker/GPU

Vosk docker images with GPU for Jetson Nano / Xavier NX boards and PCs with NVIDIA cards.

Usage

Pull an existing image with the required tag:

docker pull sskorol/vosk-api:$TAG

Use it as a baseline in your app's Dockerfile:

FROM sskorol/vosk-api:$TAG
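
For example, a hypothetical Dockerfile for a small app built on top of the base image might look like this (the file names, tag, and start command are assumptions, not part of this repo):

```Dockerfile
# Assumed tag; pick the one matching your platform.
ARG TAG=0.3.37-pc
FROM sskorol/vosk-api:${TAG}

WORKDIR /app
# Install the app's own dependencies on top of the Vosk base image.
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .

# Hypothetical entrypoint script.
CMD ["python3", "server.py"]
```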

Build prerequisites

  • You have to enable the NVIDIA runtime before building the images.
  • For Jetson boards, your JetPack version should be at least 32.5 (the 0.3.32 images were built against 32.6.1).
  • For PCs, make sure you meet the following prerequisites.
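
Enabling the NVIDIA runtime by default usually means pointing Docker at nvidia-container-runtime in /etc/docker/daemon.json (a sketch; it assumes the NVIDIA Container Toolkit is already installed, and Docker must be restarted afterwards):

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```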

Building

Clone the sources and check the build script's help:

git clone https://github.com/sskorol/vosk-api-gpu.git
cd vosk-api-gpu

Jetson boards

Run the build script with the args required for your platform, e.g.:

cd jetson && ./build.sh -m nano -i ml -t 0.3.37

You can check the available NVIDIA base image tags here and here.

PCs

To build images for PC, use the following script:

cd pc && ./build.sh -c 11.3.1-devel-ubuntu20.04 -t 0.3.37

Here, you have to provide the base CUDA image tag and the output container's tag. You can read more by running the script with the -h flag.

This script builds two images: a base image and a sample Vosk server.
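
Once the build finishes, you can sanity-check the result by listing the images and starting the server with GPU access (the image name and port below are assumptions based on the default tags, so adjust them to your build):

```shell
docker images | grep vosk
docker run --rm --gpus all -p 2700:2700 sskorol/vosk-server:0.3.37
```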

Windows 11 with WSL2

  • Follow the official instructions to install Docker Desktop.
  • Make sure you have fully completed the GPU part of the above guide.
  • Either use an existing image or build a new one following the PCs part of this README.
  • Follow the Running part of this README to test your recording.
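
Before pulling the Vosk images, a quick way to confirm the GPU is visible from containers under WSL2 is to run nvidia-smi inside a stock CUDA image (the tag here is just an example):

```shell
docker run --rm --gpus all nvidia/cuda:11.3.1-base-ubuntu20.04 nvidia-smi
```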

Apple M1

To build images (w/o GPU) for Apple M1, use the following script:

cd m1 && ./build.sh -t 0.3.37

To build Kaldi and Vosk API locally (w/o Docker), use the following script (thanks to @aivoicesystems):

cd m1 && ./build-local.sh

Note that there's a required-software check when you start this script. If it reports missing requirements, chances are you'll need to install the following packages:

brew install autoconf cmake automake libtool

Also note that this script installs the Vosk API globally. If you want to use it for a specific project, activate a virtual environment before running the script.
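
For a project-local install, the virtual-environment step looks roughly like this (the paths are examples):

```shell
python3 -m venv .venv
source .venv/bin/activate
cd m1 && ./build-local.sh
```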

GCP

To test images on GCP with NVIDIA Tesla T4, use the following steps:

  • Install terraform
  • Create a new project on GCP
  • Install and init gcloud-cli
  • Deploy a new Compute Engine instance with the following commands:
cd gcp && terraform init && terraform apply

Note that you'll be prompted for your GCP project name and zone. Once the instance is deployed, you can SSH into it:

gcloud compute ssh --project $PROJECT_NAME --zone $ZONE gpu

Clone this repo and build the Vosk images on this relatively powerful machine:

git clone https://github.com/sskorol/vosk-api-gpu && cd vosk-api-gpu/gcp
./build.sh -c 11.3.1-devel-ubuntu20.04 -t 0.3.37 -m vosk-model-en-us-0.22

See the build script's help for more details on the input arguments.
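
Since a T4-backed instance bills while it's running, remember to tear down the deployment when you're done testing:

```shell
cd gcp && terraform destroy
```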

Running

The following script starts docker-compose, installs the requirements, and runs a simple test:

./test.sh $TAG $WAV_FILE

  • Pass the newly built image tag as the first argument.
  • Unless you're using a GCP instance, you first have to download and extract the required model into the ./model folder.
  • Pass your own recording as the second argument; en.wav is used by default.
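
The test essentially streams the WAV file to the server's websocket in fixed-size chunks and prints the recognition results. A minimal sketch of such a client (the URL, chunk size, and helper names are assumptions, not the repo's exact test.py):

```python
import asyncio
import json
import wave

def chunk_audio(data: bytes, frame_size: int) -> list[bytes]:
    """Split raw PCM bytes into fixed-size frames (the last one may be shorter)."""
    return [data[i:i + frame_size] for i in range(0, len(data), frame_size)]

async def run_test(uri: str, wav_path: str) -> None:
    # websockets is imported lazily so the chunking helper stays stdlib-only.
    import websockets
    async with websockets.connect(uri) as ws:
        with wave.open(wav_path, "rb") as wf:
            # Tell the server the sample rate before streaming audio.
            await ws.send(json.dumps({"config": {"sample_rate": wf.getframerate()}}))
            data = wf.readframes(wf.getnframes())
        for frame in chunk_audio(data, 8000):
            await ws.send(frame)
            print(await ws.recv())   # partial results
        await ws.send('{"eof" : 1}')
        print(await ws.recv())       # final result

# Usage (requires a running Vosk server):
# asyncio.run(run_test("ws://localhost:2700", "en.wav"))
```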

Important notes

  • Jetson Nano won't work with the latest large model due to its memory requirements (at least 8 GB of RAM).
  • Jetson Xavier will work with the latest large model if you remove the rnnlm folder from the model.
  • Make sure you have at least Docker 20.10.6 and Compose 1.29.1.
  • Your host's CUDA version must match the container's, as they share the same runtime. Jetson images were built with CUDA 10.1; the desktop images used CUDA 11.3.1.
  • If you plan to use rnnlm, make sure you allocate at least 12 GB of RAM to your Docker instance (16 GB is optimal).
  • On GCP, there's a known issue with K80 instances: the architecture seems to be too old, so it's recommended to use at least an NVIDIA T4.
  • Not all models are adapted for GPU usage; e.g. for the RU model you have to manually patch the configs to make it work (this is done automatically for GCP instances):
    • remove the min-active flag from model/conf/model.conf
    • copy/paste ivector.conf from the big EN model
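
The min-active step can be scripted with sed; a sketch on a throwaway file (in practice, point CONF at model/conf/model.conf inside the extracted model):

```shell
# Demo on a temp copy; the flag and values mimic a typical Kaldi model.conf line.
CONF=$(mktemp)
printf -- '--min-active=200 --max-active=7000 --beam=10.0\n' > "$CONF"
# Strip the --min-active flag (and its trailing whitespace) in place.
sed -i 's/--min-active=[0-9]*[[:space:]]*//g' "$CONF"
cat "$CONF"   # --max-active=7000 --beam=10.0
```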


vosk-api-gpu's Issues

Do Dynamic Models work on the GPU

Hi, I'm really interested in this project, but I haven't had a chance to play with it.

I wanted to ask a question about Vosk's dynamic/static models. As I understand it, most of the models on https://alphacephei.com/vosk/models are big, statically compiled models: to expand the vocabulary, for example to add "Github" (assuming that's not already something it can output), you'd have to recompile the dictionary and language model before "Github" could be detected and returned as output. There are also models with a dynamic graph, which allow online modification of the vocabulary without the time-expensive recompile.

I'm interested in knowing if you've gotten models with dynamic graphs working on the GPU, and if they also benefit from performance improvements. Any insight would be appreciated.

Thanks!
Joshua Rosales

Optimize build scripts

Some scripts currently use hardcoded values. To increase flexibility, they should be made more generic.

Sync Vosk with upstream

There were a bunch of updates in the original repo related to GPU decoding. The build files and test scripts need to be synced.

Problem building 0.3.37-m1 in jetson nano

First I enabled the nvidia runtime on my Jetson Nano and checked my JetPack version:

Package: nvidia-jetpack
Version: 4.6.2-b5
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.6.2-b5), nvidia-opencv (= 4.6.2-b5), nvidia-cudnn8 (= 4.6.2-b5), nvidia-tensorrt (= 4.6.2-b5), nvidia-visionworks (= 4.6.2-b5), nvidia-container (= 4.6.2-b5), nvidia-vpi (= 4.6.2-b5), nvidia-l4t-jetson-multimedia-api (>> 32.7-0), nvidia-l4t-jetson-multimedia-api (<< 32.8-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.6.2-b5_arm64.deb
Size: 29378
SHA256: 925f4abff97e6024d86cff3b9e132e7c7554d05fb83590487381b7e925d5b2bb
SHA1: e3ef727e87df5c331aece34508c110d57d744fe9
MD5sum: 7cb2e387af41bc8143ac7b6525af7794
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Added my user to the docker group

sudo groupadd docker
sudo usermod -aG docker ${USER}

then proceeded to pull the image for arm64:

docker pull sskorol/vosk-api:0.3.37-m1

Changed the Dockerfile to use the proper tag

ARG L4T_VERSION=r32.6.1

FROM sskorol/vosk-api:0.3.37-m1

Then proceeded to use the command:

cd jetson && ./build.sh -m nano -i ml -t 0.3.37

But at step 14 or 15, the build freezes at:

c++ -std=c++17 -I.. -isystem /opt/kaldi/tools/openfst-1.8.0/include -O3 -march=armv8-a -Wno-sign-compare -Wall -Wno-sign-comare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_OPENBLAS -I/opt/kaldi/tools/OpenBLAS/install/include -ftree-vectorize -pthread -g -fPIC -DUSE_KALDI_SVD -c -o word-allign-lattice.o word-allign-lattice.cc

After 1 hour or so it comes back and says:

c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[1]: *** [<builtin>: lattice-functions-transition-model.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/opt/kaldi/src/lat'
make: *** [Makefile:147: lat] Error 2
The command '/bin/sh -c cd /opt/kaldi/src &&   ./configure --mathlib=OPENBLAS_CLAPACK --shared --use-cuda &&  sed -i "s: -01 : -03 -march=$ARCH :g" kaldi.mk && echo "[BUILDING KALDI] -j $(nproc) online2 lm rnnlm cudafeat cudadecoder" returned a non-zero code: 2
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /home/jetson/projects/vosk-api-gpu/Dockerfile.server: no such file or directory

I'm currently using a Jetson Nano with 4 GB of RAM.

OSError: Multiple exceptions

All the instructions were executed successfully but when I tried running the code, the following error occurred. What should I do?

Traceback (most recent call last):
  File "./test.py", line 28, in <module>
    run_test('ws://localhost:2700'))
  File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "./test.py", line 10, in run_test
    async with websockets.connect(uri) as websocket:
  File "/home/raghavendra.jain/vosk-api-gpu/.venv/lib/python3.7/site-packages/websockets/legacy/client.py", line 633, in __aenter__
    return await self
  File "/home/raghavendra.jain/vosk-api-gpu/.venv/lib/python3.7/site-packages/websockets/legacy/client.py", line 650, in __await_impl_timeout__
    return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
  File "/opt/conda/lib/python3.7/asyncio/tasks.py", line 442, in wait_for
    return fut.result()
  File "/home/raghavendra.jain/vosk-api-gpu/.venv/lib/python3.7/site-packages/websockets/legacy/client.py", line 654, in __await_impl__
    transport, protocol = await self._create_connection()
  File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 971, in create_connection
    ', '.join(str(exc) for exc in exceptions)))
OSError: Multiple exceptions: [Errno 111] Connect call failed ('::1', 2700, 0, 0), [Errno 111] Connect call failed ('127.0.0.1', 2700)

Add GCP support

To build and test Vosk with GPU on a relatively powerful machine in a cloud, it's required to prepare a special build script and terraform template.
As the number of configurations grows, it'd make sense to restructure the project.

No CUDA GPU detected!

Hi,
I tried to run vosk server on GPU but I encountered this error:
No CUDA GPU detected!

I have a GeForce GTX 1080 GPU and this is the output of nvidia-smi command:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:02:00.0 Off |                  N/A |
|  0%   45C    P8    11W / 275W |     36MiB / 11177MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       947      G   /usr/lib/xorg/Xorg                 26MiB |
|    0   N/A  N/A      1215      G   /usr/bin/gnome-shell                7MiB |
+-----------------------------------------------------------------------------+

These are my steps:

  1. I pulled the PC docker image using this command:

docker pull sskorol/vosk-api:0.3.30-pc

  2. I built the server image via this command:

sudo docker build -f Dockerfile.server --no-cache -t sskorol/vosk-server:0.3.30-pc --build-arg TAG=0.3.30-pc .

These are my docker images:

sudo docker images
REPOSITORY                              TAG         IMAGE ID       CREATED          SIZE
sskorol/vosk-server                     0.3.30-pc   23cec0dafac5   20 minutes ago   9.86GB
sskorol/vosk-api                        0.3.30-pc   76dd2567cd01   2 months ago     9.85GB
  3. I ran the server image using this command:

sudo docker run -it 23cec0dafac5

and at this step I encountered this Error:

ERROR ([5.5.873~1527-75ec]:SelectGpuId():cu-device.cc:185) No CUDA GPU detected!, diagnostics: cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version", in cu-device.cc:185

[ Stack-Trace: ]
/usr/local/lib/python3.8/dist-packages/vosk-0.3.30-py3.8.egg/vosk/libvosk.so(kaldi::MessageLogger::LogMessage() const+0x7fe) [0x7f39219c921e]
/usr/local/lib/python3.8/dist-packages/vosk-0.3.30-py3.8.egg/vosk/libvosk.so(+0x66fc83) [0x7f392186ec83]
/usr/local/lib/python3.8/dist-packages/vosk-0.3.30-py3.8.egg/vosk/libvosk.so(kaldi::CuDevice::SelectGpuId(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0xe28) [0x7f3921875218]
/usr/local/lib/python3.8/dist-packages/vosk-0.3.30-py3.8.egg/vosk/libvosk.so(vosk_gpu_init+0x7c) [0x7f392158920c]
/usr/lib/x86_64-linux-gnu/libffi.so.7(+0x6ff5) [0x7f3924228ff5]
/usr/lib/x86_64-linux-gnu/libffi.so.7(+0x640a) [0x7f392422840a]
/usr/lib/python3/dist-packages/_cffi_backend.cpython-38-x86_64-linux-gnu.so(+0x1afd7) [0x7f3924248fd7]
python3(_PyObject_MakeTpCall+0x150) [0x5f3010]
python3(_PyEval_EvalFrameDefault+0x5d43) [0x5700f3]
python3(_PyFunction_Vectorcall+0x1b6) [0x5f5956]
python3(_PyEval_EvalFrameDefault+0x72f) [0x56aadf]
python3(_PyFunction_Vectorcall+0x1b6) [0x5f5956]
python3(_PyEval_EvalFrameDefault+0x72f) [0x56aadf]
python3(_PyEval_EvalCodeWithName+0x26a) [0x568d9a]
python3(PyEval_EvalCode+0x27) [0x68cdc7]
python3() [0x67e161]
python3() [0x67e1df]
python3() [0x67e281]
python3(PyRun_SimpleFileExFlags+0x197) [0x67e627]
python3(Py_RunMain+0x212) [0x6b6e62]
python3(Py_BytesMain+0x2d) [0x6b71ed]
/usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f3924da40b3]
python3(_start+0x2e) [0x5f96de]

terminate called after throwing an instance of 'kaldi::KaldiFatalError'
  what():  kaldi::KaldiFatalError

What is the problem?

Thanks.

Only 0.22-en Model seems to work with this

When trying any model other than 0.22, I receive errors, for example:

vosk-api-gpu-vosk-1  | Command line was: 

vosk-api-gpu-vosk-1  | ERROR ([5.5.1041~1-098ee]:ReadConfigFile():parse-options.cc:493) Invalid option --min-active=200 in config file model/conf/model.conf

Can't run test.py on AGX Xavier

Hi, first of all, thank you for making this open-source, it is really helpful. I recently followed the steps to build Vosk with GPU support on my NVIDIA AGX Xavier, but after running ./test.py weather.wav I ran into this error:

Traceback (most recent call last):
  File "./test.py", line 23, in <module>
    run_test('ws://localhost:2700'))
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "./test.py", line 8, in run_test
    async with websockets.connect(uri) as websocket:
  File "/home/xbio/.local/lib/python3.6/site-packages/websockets/legacy/client.py", line 604, in __aenter__
    return await self
  File "/home/xbio/.local/lib/python3.6/site-packages/websockets/legacy/client.py", line 622, in __await_impl__
    transport, protocol = await self._create_connection()
  File "/usr/lib/python3.6/asyncio/base_events.py", line 794, in create_connection
    raise exceptions[0]
  File "/usr/lib/python3.6/asyncio/base_events.py", line 781, in create_connection
    yield from self.sock_connect(sock, address)
  File "/usr/lib/python3.6/asyncio/selector_events.py", line 439, in sock_connect
    return (yield from fut)
  File "/usr/lib/python3.6/asyncio/selector_events.py", line 469, in _sock_connect_cb
    raise OSError(err, 'Connect call failed %s' % (address,))
ConnectionRefusedError: [Errno 111] Connect call failed ('127.0.0.1', 2700)

About my system:
OS: Linux 18.04
CUDA Version: 10.2

I used the docker pull sskorol/vosk-api:0.3.30-xavier command to pull the image.
After that, I used ./build.sh -m xavier -i base -l r32.6.1 -t 0.3.30
and installed the dependencies: pip3 install websockets asyncio

Everything until there was fine, I hit the problem when trying to run the test, any comments on this?

I'm new to Vosk and all the help would be really appreciated. I'm really trying to figure out if I can implement speech recognition faster into my code. My idea is to use a camera and control it using a USB microphone. I already achieved this using Vosk (which I built normally, not using GPU), the problem is that it slowed down my code so much that I couldn't even read frames from my camera anymore.

I would also appreciate it if you could provide some guidance on how can I implement your docker images/code into my python code and if needed I can provide the script I'm using right now for speech recognition.

Thanks!

Add Apple M1 support

Some users experience issues using Vosk on M1. It's required to provide such support at least w/o GPU atm.

Enhance mac m1 build

Seems like there are optimal compiler flags to be used while local m1 build. It's required to test and push these improvements.

Image can not work on WSL2

WSL2 = Windows Subsystem for Linux.
WSL2 is now basically the same as Ubuntu and can also run Docker.
I loaded the image from https://github.com/sskorol/vosk-api-gpu on WSL2, but it said that the GPU could not be found.

I was able to run NVIDIA's Docker containers on this platform successfully; refer to this URL: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
This shows that the GPU driver and the Docker environment work.

The report below was produced on the WSL2 host, not in a container.

uname -a
Linux DESKTOP-S7O3QUS 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux
/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
/usr/lib/wsl/lib/nvidia-smi
Thu Apr 14 14:12:46 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02    Driver Version: 512.15       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
| 23%   36C    P0    49W / 307W |    443MiB /  8192MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vosk import BatchModel, BatchRecognizer, GpuInit
>>> GpuInit()
>>> model = BatchModel()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/vosk/__init__.py", line 117, in __init__
    raise Exception("Failed to create a model")
Exception: Failed to create a model
>>> 

Running with a small model

Hello, I tried to run the service with vosk-model-small-ru-0.22. Docker-compose starts successfully without errors, but after ~10 seconds it crashes with status 139. If I use a big model, everything is OK.
I'm using the -c 10.2-cudnn7-devel-centos7 -t 0.3.37 options.
Have you ever hit this problem? Or maybe there are some instructions for running with small models?
Thanks!

Running without docker

Hi, I am not very familiar with Docker and would like to build this on my desktop. I am using a K80 on a GCP VM instance. Please suggest which file I should follow. Thanks a lot!
