elan-ev / vosk-cli
License: Apache License 2.0
vosk-cli/voskcli/transcribe.py
Lines 81 to 83 in 81b9f1c
These are a bunch of smaller issues and possible improvements. If you want to work on one of these, please leave a comment and open an issue or PR, which I will link to the task below.
A distribution package for vosk-cli would simplify installation/deployment and would also enable us to publish to PyPI and potentially other package indexes.
In testing opencast/opencast#3806, I installed vosk-cli per the current README. Functional testing with an Opencast workflow spat out this error:
2022-06-03 12:25:22,763 | ERROR | (AbstractJobProducer$JobRunner:343) - Error handling operation 'speechtotext':
org.opencastproject.speechtotext.api.SpeechToTextServiceException: Error while generating subtitle from http://localhost:8080/files/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg
at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:166) ~[?:?]
at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:313) [!/:?]
at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:272) [!/:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
Output:
Traceback (most recent call last):
File "/home/greg/.local/bin/vosk-cli", line 5, in <module>
from scripts.transcribe import main
ModuleNotFoundError: No module named 'scripts'
at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:123) ~[?:?]
at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]
... 6 more
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
Output:
Traceback (most recent call last):
File "/home/greg/.local/bin/vosk-cli", line 5, in <module>
from scripts.transcribe import main
ModuleNotFoundError: No module named 'scripts'
at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:115) ~[?:?]
at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]
... 6 more
2022-06-03 12:25:25,656 | ERROR | (WorkflowOperationWorker:140) - Workflow operation 'operation:'speechtotext, state:'FAILED'' failed
org.opencastproject.workflow.api.WorkflowOperationException: Speech-to-Text job for media package 'a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6' failed
at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.createSubtitle(SpeechToTextWorkflowOperationHandler.java:181) ~[?:?]
at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.start(SpeechToTextWorkflowOperationHandler.java:146) ~[?:?]
at org.opencastproject.workflow.impl.WorkflowOperationWorker.start(WorkflowOperationWorker.java:212) ~[!/:?]
at org.opencastproject.workflow.impl.WorkflowOperationWorker.execute(WorkflowOperationWorker.java:117) [!/:?]
at org.opencastproject.workflow.impl.WorkflowServiceImpl.runWorkflowOperation(WorkflowServiceImpl.java:719) [!/:?]
at org.opencastproject.workflow.impl.WorkflowServiceImpl.process(WorkflowServiceImpl.java:1736) [!/:?]
at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2097) [!/:?]
at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2063) [!/:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Installing scripts with pip install scripts does not resolve the issue.
vosk-cli/voskcli/transcribe.py
Lines 249 to 252 in 81b9f1c
vosk-cli/voskcli/transcribe.py
Lines 337 to 340 in 81b9f1c
Searching in $XDG_DATA_DIRS would be nice. (Spec.) You don't always want to install models system-wide.
As it's not always set, it should also look through default values of /usr/share/ (as currently), /usr/local/share/, and $HOME/.local/share/.
In addition to exposing manually installed models in non-root locations, this would also allow automatic use of models installed via e.g. Flatpak or Nix, which set $XDG_DATA_DIRS.
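The lookup order proposed above could be sketched roughly as follows. This is only an illustration, not vosk-cli's actual code: the function name model_search_dirs and its parameters are hypothetical, and honoring $XDG_DATA_HOME (whose default is $HOME/.local/share) is an assumption added on top of what the issue asks for. The fallback value for an unset or empty $XDG_DATA_DIRS follows the XDG Base Directory specification.

```python
import os
from pathlib import Path


def model_search_dirs(environ=None, subdirs=("vosk/models",)):
    """Return candidate model directories, highest priority first.

    Hypothetical helper: name and signature are not part of vosk-cli.
    """
    env = os.environ if environ is None else environ
    home = env.get("HOME") or str(Path.home())

    # The XDG spec says unset or empty variables fall back to these defaults.
    data_home = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    data_dirs = env.get("XDG_DATA_DIRS") or "/usr/local/share/:/usr/share/"

    bases = [data_home] + [d for d in data_dirs.split(":") if d]
    # Also keep the system defaults named in the issue, even when
    # $XDG_DATA_DIRS is set to something that omits them.
    bases += ["/usr/local/share/", "/usr/share/"]

    seen, result = set(), []
    for base in bases:
        for sub in subdirs:
            candidate = os.path.normpath(os.path.join(base, sub))
            if candidate not in seen:  # drop duplicates, keep first position
                seen.add(candidate)
                result.append(candidate)
    return result


if __name__ == "__main__":
    for directory in model_search_dirs():
        print(directory)
```

The subdirs parameter lets a caller try more than one layout under each base directory, and the first existing directory containing the requested language model would win.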
Also, from the AUR, the share subdirectory seems to be vosk-models, not vosk/models:
$ pacman -Ql vosk-api-bin
vosk-api-bin /usr/
vosk-api-bin /usr/include/
vosk-api-bin /usr/include/vosk_api.h
vosk-api-bin /usr/lib/
vosk-api-bin /usr/lib/libvosk.so
vosk-api-bin /usr/local/
vosk-api-bin /usr/local/share/
vosk-api-bin /usr/local/share/vosk-models/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/README
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/am/final.mdl
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/mfcc.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/conf/model.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/Gr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/HCLr.fst
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/disambig_tid.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/graph/phones/word_boundary.int
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.dubm
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.ie
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/final.mat
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/global_cmvn.stats
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/online_cmvn.conf
vosk-api-bin /usr/local/share/vosk-models/small-en-us/ivector/splice.conf
But that may be a packaging issue in the vosk-api-bin package, as vosk-api does use vosk/models.
Just a thought/convenience that might be nice to have.
I decided to run bandit to check for potential security issues, as proposed in #10.
The only thing bandit found was the usage of subprocess to run ffmpeg, although it rates the severity as "low" since subprocess.Popen is already used in a fairly safe way (i.e. not spawning a command shell). So the only issue I can really see here is that the ffmpeg command accepts an arbitrary input file.
One way to "fix" the arbitrary input file issue would be to avoid running ffmpeg and instead have the user provide a valid file format, but I assume we don't want that.
Then the other thing to do is just general safety measures, like sandboxing, limiting process resources, and limiting the input file to well-known formats. Not sure if any of these are "worth it" though; that would be up for discussion.
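To make two of these measures concrete for the discussion, here is a minimal sketch of an extension allowlist plus per-process resource limits. Everything in it is an assumption rather than vosk-cli's code: the names ALLOWED_EXTENSIONS, check_input, build_command, limit_resources and decode_audio are hypothetical, the allowlist and limits are arbitrary, and the start of the ffmpeg command is a guess (only the tail from '-ar' onward appears in the bandit output). An extension check does not verify the file's actual content, and the resource module is Unix-only.

```python
import os
import subprocess

# Hypothetical allowlist; which formats count as "well-known" is up for discussion.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".mp4", ".mpg", ".mkv", ".webm", ".ogg", ".flac"}


def check_input(path, allowed=ALLOWED_EXTENSIONS):
    """Reject files whose extension is not on the allowlist."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in allowed:
        raise ValueError(f"unsupported input format: {ext or '(none)'}")
    return os.path.abspath(path)


def build_command(path, sample_rate=16000):
    """Build the ffmpeg argument list (no shell involved)."""
    # The part before '-ar' is an assumption; the tail matches the bandit report.
    return ["ffmpeg", "-nostdin", "-loglevel", "quiet", "-i", check_input(path),
            "-ar", str(sample_rate), "-ac", "1", "-f", "s16le", "-"]


def limit_resources(cpu_seconds=3600, memory_bytes=2 * 1024 ** 3):
    """Cap CPU time and address space of the child process (Unix only)."""
    import resource
    resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
    resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))


def decode_audio(path, sample_rate=16000):
    """Start ffmpeg with the limits applied in the child before exec."""
    # Note: preexec_fn is not safe to use in multi-threaded programs.
    return subprocess.Popen(build_command(path, sample_rate),
                            stdout=subprocess.PIPE,
                            preexec_fn=limit_resources)
```

Whether limits like these are too strict for long recordings would need testing; sandboxing would be a separate, larger change.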
Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with the subprocess module.
Severity: Low Confidence: High
CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
Location: voskcli/transcribe.py:21:0
More Info: https://bandit.readthedocs.io/en/1.7.4/blacklists/blacklist_imports.html#b404-import-subprocess
20 import os
21 import subprocess
22 import json
--------------------------------------------------
>> Issue: [B603:subprocess_without_shell_equals_true] subprocess call - check for execution of untrusted input.
Severity: Low Confidence: High
CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
Location: voskcli/transcribe.py:189:14
More Info: https://bandit.readthedocs.io/en/1.7.4/plugins/b603_subprocess_without_shell_equals_true.html
188 '-ar', str(sample_rate), '-ac', '1', '-f', 's16le', '-']
189 process = subprocess.Popen(command, stdout=subprocess.PIPE)
190
--------------------------------------------------
Code scanned:
Total lines of code: 195
Total lines skipped (#nosec): 0
Run metrics:
Total issues (by severity):
Undefined: 0
Low: 2
Medium: 0
High: 0
Total issues (by confidence):
Undefined: 0
Low: 0
Medium: 0
High: 2