The vosk-cli from elan-ev

More robust model location autodetection?

Lines 249 to 252 in 81b9f1c

# get all available models if we got the special value auto
if try_models == ['auto']:
    try_models = glob('./models/*') + glob('/usr/share/vosk/models/*')
    try_models = [model for model in try_models if os.path.isdir(model)]

vosk-cli/voskcli/transcribe.py

Lines 337 to 340 in 81b9f1c

# Try finding a matching module
modules = glob(f'/usr/share/vosk/models/*-{lang}-*') \
    or glob(f'./models/*-{lang}-*')
modules = [model for model in modules if os.path.isdir(model)]

Searching in $XDG_DATA_DIRS would be nice. (Spec.) You don't always want to install models system-wide.

As it's not always set, it should also look through default values of /usr/share/ (as currently), /usr/local/share/, and $HOME/.local/share/.

In addition to exposing manually installed models in non-root locations, this would also allow automatic use of models installed via e.g. Flatpak or Nix, which set $XDG_DATA_DIRS.
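A minimal sketch of what such a lookup could look like, assuming the fallback values named above and keeping the existing ./models location first. The function names, the subdir parameter and the environ parameter are hypothetical and not part of vosk-cli:

```python
import os
from glob import glob


def model_search_dirs(subdir="vosk/models", environ=None):
    """Return candidate model directories in search order (hypothetical helper)."""
    env = os.environ if environ is None else environ
    home = env.get("HOME") or os.path.expanduser("~")
    # Unset or empty variables fall back to the defaults discussed above.
    data_home = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    data_dirs = env.get("XDG_DATA_DIRS") or "/usr/local/share:/usr/share"
    bases = [data_home] + [d for d in data_dirs.split(":") if d]
    candidates = [os.path.join(".", "models")]
    candidates += [os.path.join(base, subdir) for base in bases]
    # Drop duplicates while keeping the first occurrence of each directory.
    return list(dict.fromkeys(candidates))


def find_models(pattern="*", subdir="vosk/models", environ=None):
    """Return all model directories matching a glob pattern (hypothetical helper)."""
    found = []
    for directory in model_search_dirs(subdir, environ):
        matches = sorted(glob(os.path.join(directory, pattern)))
        found += [path for path in matches if os.path.isdir(path)]
    return found
```

The 'auto' case would then be find_models(), and the per-language lookup find_models(f'*-{lang}-*'). Making the subdirectory a parameter leaves room for packages that use a different layout.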

Also, from the AUR, the share subdirectory seems to be vosk-models, not vosk/models:

$ pacman -Ql vosk-api-bin
vosk-api-bin /usr/
vosk-api-bin /usr/include/
vosk-api-bin /usr/include/vosk_api.h
vosk-api-bin /usr/lib/
vosk-api-bin /usr/lib/libvosk.so
vosk-api-bin /usr/local/
vosk-api-bin /usr/local/share/
vosk-api-bin /usr/local/share/vosk-models/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/README
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/final.mdl
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/mfcc.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/model.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/Gr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/HCLr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/disambig_tid.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/word_boundary.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.dubm
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.ie
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.mat
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/global_cmvn.stats
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/online_cmvn.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/splice.conf

But that may be a packaging issue in the vosk-api-bin package, as vosk-api does use vosk/models.

What is the canonical place to put models? Do the docs give a recommendation?
And what about Windows and other non-LSB/non-FHS operating systems?

Just a thought/convenience that might be nice to have.

Running bandit

I decided to run bandit to check for potential security issues, as proposed in #10.

The only thing bandit found was the usage of subprocess to run ffmpeg, although it rates the severity as "low" since subprocess.Popen is already used in a fairly safe way (i.e. not spawning a command shell). So the only issue I can really see here is that the ffmpeg command accepts an arbitrary input file.

One way to "fix" the arbitrary input file issue would be to avoid running ffmpeg and instead have the user provide a valid file format, but I assume we don't want that.

The other option is general safety measures such as sandboxing, limiting process resources, and restricting the input file to well-known formats. I'm not sure any of these are "worth it", though; that would be up for discussion.
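To make the last two ideas concrete, here is a rough sketch of an extension allowlist plus a time limit around the ffmpeg call. The allowlist contents, the function names, the timeout value and the leading part of the command ('ffmpeg', '-nostdin', '-i') are assumptions for illustration; only the trailing arguments are taken from the bandit log below:

```python
import os
import subprocess

# Hypothetical allowlist; the real set of accepted formats would need discussion.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".mp4", ".m4a", ".ogg", ".flac", ".webm", ".mkv", ".mpg"}


def build_ffmpeg_command(input_path, sample_rate=16000):
    """Build the argument list for ffmpeg, rejecting unexpected file types."""
    extension = os.path.splitext(input_path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported input format: {extension or '(none)'}")
    # Passing a list (no shell) keeps the file name from being interpreted as a command.
    return ["ffmpeg", "-nostdin", "-i", os.path.abspath(input_path),
            "-ar", str(sample_rate), "-ac", "1", "-f", "s16le", "-"]


def decode_audio(input_path, sample_rate=16000, timeout=3600):
    """Run ffmpeg with a time limit and return the raw PCM bytes."""
    command = build_ffmpeg_command(input_path, sample_rate)
    completed = subprocess.run(command, stdout=subprocess.PIPE,
                               timeout=timeout, check=True)
    return completed.stdout
```

Note that this sketch buffers the whole output in memory, unlike a streaming Popen pipe, and an extension check says nothing about the actual file contents, so it is at best a first filter.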

Bandit logs

Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with the subprocess module.
   Severity: Low   Confidence: High
   CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
   Location: voskcli/transcribe.py:21:0
   More Info: https://bandit.readthedocs.io/en/1.7.4/blacklists/blacklist_imports.html#b404-import-subprocess
20	import os
21	import subprocess
22	import json

--------------------------------------------------
>> Issue: [B603:subprocess_without_shell_equals_true] subprocess call - check for execution of untrusted input.
   Severity: Low   Confidence: High
   CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
   Location: voskcli/transcribe.py:189:14
   More Info: https://bandit.readthedocs.io/en/1.7.4/plugins/b603_subprocess_without_shell_equals_true.html
188	               '-ar', str(sample_rate), '-ac', '1', '-f', 's16le', '-']
189	    process = subprocess.Popen(command, stdout=subprocess.PIPE)
190	

--------------------------------------------------

Code scanned:
	Total lines of code: 195
	Total lines skipped (#nosec): 0

Run metrics:
	Total issues (by severity):
		Undefined: 0
		Low: 2
		Medium: 0
		High: 0
	Total issues (by confidence):
		Undefined: 0
		Low: 0
		Medium: 0
		High: 2

Collection of tasks

These are a bunch of smaller issues and possible improvements. If you want to work on one of these, please leave a comment and open an issue or PR, which I will link to the task below.

Deployment

- Packaging: Create a distribution package for vosk-cli. See #12
- Use Bandit to find possible security issues #16
- Deployment to PyPI: Create entry for vosk-cli in PyPI and configure automatic deployment. See #13
- Include language model(s) in some form

Usability/Documentation

- Revise README with updated instructions on how to run vosk-cli. See #14
- Extend README with specific installation instructions for different operating systems (if needed). See #14
- Display detailed help when vosk-cli call is missing parameters
- Create technical documentation (potentially larger task)

Features

- Display average confidence coefficient for transcriptions. See #15
- Enable the use of punctuation models: Collaborate on/contribute to #9
- Language recognition: Use multiple models and automatically detect spoken language based on confidence coefficient (this might be a larger task)
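
As a starting point for the two confidence-related tasks above, a small sketch follows. It assumes each recognizer result is a dict with a "result" list whose entries carry a per-word "conf" value, which is how Vosk reports words when word output is enabled; the function names are hypothetical:

```python
def average_confidence(results):
    """Average the per-word confidence over a list of recognizer results."""
    confidences = [word["conf"]
                   for result in results
                   for word in result.get("result", [])]
    # An empty transcription gets 0.0 rather than raising ZeroDivisionError.
    return sum(confidences) / len(confidences) if confidences else 0.0


def best_language(results_by_language):
    """Pick the language whose transcription has the highest average confidence."""
    return max(results_by_language,
               key=lambda lang: average_confidence(results_by_language[lang]))
```

Whether a plain mean is a good enough signal for language detection is an open question; a short sample transcribed with each candidate model may be enough to decide.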

Create distribution package

A distribution package for vosk-cli would simplify installation/deployment and would also enable us to publish to PyPI and potentially other package indexes.

Missing Dependencies

In testing opencast/opencast#3806, I installed vosk-cli per the current README. Functional testing with an Opencast workflow spat out this error:

2022-06-03 12:25:22,763 | ERROR | (AbstractJobProducer$JobRunner:343) - Error handling operation 'speechtotext':
org.opencastproject.speechtotext.api.SpeechToTextServiceException: Error while generating subtitle from http://localhost:8080/files/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:166) ~[?:?]
        at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:313) [!/:?]
        at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:272) [!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
 Output:
Traceback (most recent call last):
  File "/home/greg/.local/bin/vosk-cli", line 5, in <module>
    from scripts.transcribe import main
ModuleNotFoundError: No module named 'scripts'

        at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:123) ~[?:?]
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]
        ... 6 more
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
 Output:
Traceback (most recent call last):
  File "/home/greg/.local/bin/vosk-cli", line 5, in <module>
    from scripts.transcribe import main
ModuleNotFoundError: No module named 'scripts'

        at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:115) ~[?:?]
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]
        ... 6 more
2022-06-03 12:25:25,656 | ERROR | (WorkflowOperationWorker:140) - Workflow operation 'operation:'speechtotext, state:'FAILED'' failed
org.opencastproject.workflow.api.WorkflowOperationException: Speech-to-Text job for media package 'a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6' failed
        at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.createSubtitle(SpeechToTextWorkflowOperationHandler.java:181) ~[?:?]
        at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.start(SpeechToTextWorkflowOperationHandler.java:146) ~[?:?]
        at org.opencastproject.workflow.impl.WorkflowOperationWorker.start(WorkflowOperationWorker.java:212) ~[!/:?]
        at org.opencastproject.workflow.impl.WorkflowOperationWorker.execute(WorkflowOperationWorker.java:117) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl.runWorkflowOperation(WorkflowServiceImpl.java:719) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl.process(WorkflowServiceImpl.java:1736) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2097) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2063) [!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]

Installing scripts with pip install scripts does not resolve the issue.
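
The traceback shows the generated vosk-cli launcher importing scripts.transcribe, so what is missing appears to be the project's own module rather than a third-party dependency; that reading is an assumption based on the log alone. A quick way to check whether the module the launcher needs is importable in a given environment could look like this (the helper name is hypothetical):

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the named module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. 'scripts') raises ModuleNotFoundError here.
        return False


if __name__ == "__main__":
    print("scripts.transcribe importable:", module_available("scripts.transcribe"))
```

Running this with the same Python interpreter that the launcher uses would show whether the installation step actually put the module on the import path.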

Absolute path check will fail if the input path includes double slashes (empty components, `'//'`), which can happen in scripts.

vosk-cli/voskcli/transcribe.py

Lines 81 to 83 in 81b9f1c

# Do we have an absolute path to a directory?
absmodel = os.path.abspath(model)
if model.startswith(absmodel):
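
The following sketch reproduces the problem on a POSIX system and shows one possible alternative. os.path.abspath normalizes the path, collapsing empty components, so the original string no longer starts with the normalized one. The function names are hypothetical, and using os.path.isabs is a suggested replacement, not the project's current code:

```python
import os


def original_check(model):
    """The check as quoted above: compares the raw string with its normalized form."""
    return model.startswith(os.path.abspath(model))


def is_absolute_path(model):
    """Alternative: ask directly whether the path is absolute."""
    return os.path.isabs(model)


path = "/usr/share//vosk/models/en"      # double slash, e.g. from "$PREFIX/$DIR" in a script
print(os.path.abspath(path))             # normalized form without the empty component
print(original_check(path))              # False on POSIX, although the path is absolute
print(is_absolute_path(path))            # True
```

If the directory test from the comment is also wanted, os.path.isdir(model) can be combined with the absolute-path test.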

