Comments (9)
This is because for some reason -remove_silence and -remove_noise are enabled by default in the pocketsphinx configuration. This makes it drop silence and noise frames, so the frame indices only correspond to actual speech. It speeds up decoding but confuses everybody! You can fix it like this:
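import pocketsphinx

cfg = pocketsphinx.Decoder.get_default_config()
# Disable silence and noise removal so that frame indices line up with the audio
cfg.set_boolean('-remove_silence', False)
cfg.set_boolean('-remove_noise', False)
decoder = pocketsphinx.Decoder(cfg)
etc.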
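To illustrate why the reported indices drift, here is a small toy sketch in plain Python. It does not call pocketsphinx at all: the `is_speech` mask, the helper names, and the frame rate of 100 frames per second are assumptions made for illustration only.

```python
def kept_to_original_frames(is_speech):
    """Return, for each kept (speech) frame, its index in the full audio."""
    return [i for i, speech in enumerate(is_speech) if speech]


def frame_to_seconds(frame, frame_rate=100):
    """Convert a frame index to seconds (frame_rate is an assumed value)."""
    return frame / frame_rate


# Hypothetical utterance: 3 silence frames, 4 speech, 2 silence, 2 speech.
is_speech = [False] * 3 + [True] * 4 + [False] * 2 + [True] * 2

# Once silence frames are dropped, the decoder only counts the kept frames.
mapping = kept_to_original_frames(is_speech)

# Kept frame 4 is really frame 9 of the original audio.
reported = 4
actual = mapping[reported]
print(reported, "->", actual, "at", frame_to_seconds(actual), "s")
```

In this toy case every frame reported after the first silence gap is offset by the number of frames dropped before it, which is why the timings only match the audio when nothing is removed.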
from pocketsphinx-python.
Hi @dhdaines
remove_noise should almost always be enabled because the models are trained with the noise removal feature.
remove_silence is also very useful because it allows the CMN estimate to be computed properly, which gives much better accuracy.
The current proposal is to use continuous processing, in which case the timing will be correct. The implementation of continuous processing is not great, though.
If you use continuous processing, doesn't this prevent you from getting whole-utterance CMN, though?
I suppose that is only relevant if you're doing offline recognition, of course...
Whole-utterance CMN is also OK, but CMN estimation is still best without silence frames, which can be really long in real life (2 seconds of silence around a short command), unlike in common ASR databases. That's what remove_silence does, and this is why it is enabled by default.
It is true that timing and whole-utterance processing are not very easy.
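As a rough illustration of the CMN point, the toy sketch below computes a cepstral mean over speech frames only and over the same frames padded with silence. The two-dimensional "cepstra", the all-zero silence frames, and the helper names are made-up assumptions; this is not how pocketsphinx implements CMN internally.

```python
def cepstral_mean(frames):
    """Per-dimension mean over a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def apply_cmn(frames, mean):
    """Subtract the given mean from every feature vector."""
    return [[x - m for x, m in zip(f, mean)] for f in frames]


# Hypothetical features: a short command surrounded by long silence.
speech = [[4.0, 2.0], [6.0, 0.0]]
silence = [[0.0, 0.0]] * 6
utterance = silence[:3] + speech + silence[3:]

mean_speech_only = cepstral_mean(speech)        # estimated from speech alone
mean_with_silence = cepstral_mean(utterance)    # pulled toward the silence

print(mean_speech_only, mean_with_silence)
print(apply_cmn(speech, mean_speech_only))
print(apply_cmn(speech, mean_with_silence))
```

With the silence included, the mean is dominated by the silence frames, so the normalized speech frames end up far from zero mean. The longer the silence around a short command, the larger this skew becomes.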
Hi, is there a way to get the starting and ending time step of each phoneme? I am still a little confused by that.
Hi, there isn't a particularly easy way to do that at the moment. The search is word-based (otherwise it would be horribly slow). If you use allphone decoding you will get phone segmentations, but the phoneme accuracy isn't very good. The intention is that state align search can be used as a second pass to get phone alignments (in fact, it will give them to you, but this involves writing code). I am trying to find some time to implement this.
…
Thank you for your reply! I am still wondering: does continuous processing approximate the time steps?
This module is obsolete; the Python bindings are now in pocketsphinx.