Comments (11)
Sure, but you have to modify the recognizer C++ source.
from vosk-api.
Could anyone share an idea of how to return n-best outputs?
@YunzhaoLu you can check the linked PR to get an idea of how to do it. It might not be the final version though.
@sskorol Thank you.
We have this now with the SetMaxAlternatives method.
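A minimal Python sketch of how SetMaxAlternatives might be used, assuming the Python bindings (`vosk.Model`, `vosk.KaldiRecognizer`) and a result JSON that carries an `alternatives` list of `text`/`confidence` entries once alternatives are enabled. The helper names `parse_alternatives` and `transcribe_nbest`, the chunk size, and the paths are illustrative and not part of vosk-api.

```python
import json
import wave


def parse_alternatives(result_json):
    """Return (text, confidence) pairs from a recognizer result string."""
    result = json.loads(result_json)
    if "alternatives" in result:
        return [(alt["text"], alt.get("confidence")) for alt in result["alternatives"]]
    # Assumption: with alternatives disabled the result has a single "text" field.
    return [(result.get("text", ""), None)]


def transcribe_nbest(wav_path, model_path, max_alternatives=5):
    """Feed a mono 16-bit PCM WAV file to Vosk and collect one n-best list per utterance."""
    # Imported here so parse_alternatives stays usable without vosk installed.
    from vosk import Model, KaldiRecognizer

    utterances = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_path), wf.getframerate())
        rec.SetMaxAlternatives(max_alternatives)
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance has been finalized.
            if rec.AcceptWaveform(data):
                utterances.append(parse_alternatives(rec.Result()))
        utterances.append(parse_alternatives(rec.FinalResult()))
    return utterances
```

The confidence values are best treated as relative scores for ranking hypotheses within one utterance rather than as calibrated probabilities.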
Hi,
did anyone try combining the SetMaxAlternatives feature with a next-word prediction model to increase accuracy? If so, any feedback on the soundness of that approach?
E.g. I was transcribing some audio about space exploration in English (the speaker has a South African accent, guess who that can be...) with vosk-api, which is doing a pretty good job. Simply, from time to time for instance it would detect "launderette" instead of "launchpad", and I was thinking that with some context awareness the transcription should be able to give "launchpad" a higher probability than "launderette".
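The idea of re-ranking alternatives with outside context can be sketched as follows. This is only an illustration: instead of a real next-word predictor it uses a hypothetical table of domain terms (`DOMAIN_WEIGHTS`) as a stand-in scorer, and the names `domain_score`, `rescore`, and `context_weight` are made up for the example. The input is a list of (text, confidence) pairs such as an n-best list from the recognizer.

```python
# Hypothetical domain terms with hand-picked weights; a real setup would take
# these scores from a model or corpus that tracks the transcript's topic.
DOMAIN_WEIGHTS = {"launchpad": 2.0, "rocket": 1.5, "orbit": 1.5}


def domain_score(text, weights=DOMAIN_WEIGHTS):
    """Sum the weights of the domain terms that appear in a hypothesis."""
    return sum(weights.get(word, 0.0) for word in text.lower().split())


def rescore(alternatives, scorer=domain_score, context_weight=1.0):
    """Re-rank (text, confidence) pairs by confidence plus a weighted context score."""
    return sorted(
        alternatives,
        key=lambda alt: alt[1] + context_weight * scorer(alt[0]),
        reverse=True,
    )


nbest = [
    ("the rocket is on the launderette", 250.0),
    ("the rocket is on the launchpad", 249.2),
]
best_text, _ = rescore(nbest)[0]
print(best_text)
```

Any callable that maps a hypothesis to a number can replace `domain_score`, so a neural predictor could be plugged in the same way. The value of `context_weight` would need tuning, since the recognizer's confidences and the external scores are on different scales.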
> Simply, from time to time for instance it would detect "launderette" instead of "launchpad"
You can adjust the language model with https://alphacephei.com/vosk/lm
You mean that the model I am using may not contain "launchpad" in its vocabulary and I should add it? I am using the 1.8GB English model.
The word "launchpad" is already there; it might just have a suboptimal probability.
Yes, which is why I was looking to use a context-aware model (a next-word predictor, using BART for example) that tracks the transcript's domain (space here) to pick up lower-probability words.
I am not clear on what I can do with the language model here, though. I understand that tweaking the language model is useful for adding vocabulary such as abbreviations or proper nouns, but in my scenario I am not sure what I can really do. Maybe I am missing something on what LM can bring though.
> Maybe I am missing something on what LM can bring though.
Yes. LM is exactly about context awareness.
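To make that point concrete, here is a toy bigram model in Python. It is a sketch under stated assumptions, not how the Vosk models are built: the three training sentences are invented, the names `train_bigram` and `next_word_prob` are illustrative, and add-one smoothing is chosen only for brevity. It shows that the probability of a word given the previous word depends on the text the model was trained on, so in-domain text shifts the preference toward "launchpad".

```python
from collections import Counter


def train_bigram(sentences):
    """Collect the vocabulary, context counts, and bigram counts from sentences."""
    vocab, contexts, bigrams = set(), Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        contexts.update(words[:-1])  # every word that is followed by another word
        bigrams.update(zip(words, words[1:]))
    return vocab, contexts, bigrams


def next_word_prob(prev, word, model):
    """Estimate P(word | prev) with add-one smoothing over the vocabulary."""
    vocab, contexts, bigrams = model
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


# Invented in-domain text; a real adaptation corpus would be far larger.
corpus = [
    "the rocket stood on the launchpad",
    "the launchpad was cleared before ignition",
    "she took the washing to the launderette",
]
model = train_bigram(corpus)
p_launchpad = next_word_prob("the", "launchpad", model)
p_launderette = next_word_prob("the", "launderette", model)
print(p_launchpad > p_launderette)
```

Production language models condition on longer histories than one word, but the mechanism is the same: more space-related training text raises the probability of space-related words in the matching contexts.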