I found some code online and integrated it into the example, but after speaking "How are you?" within 3 seconds

How to record as raw file for decode? (pocketsphinx-python issue, closed)

cmusphinx commented on August 19, 2024

How do I record audio as a raw file for decoding?

from pocketsphinx-python.

Comments (6)

hoyeunglee commented on August 19, 2024

I replaced
C:\Users\martlee2\Downloads\pocketsphinx\model\en-us\cmudict-en-us.dict
with
https://github.com/cmusphinx/cmudict/blob/master/cmudict.dict

and replaced
C:\Users\martlee2\Downloads\pocketsphinx\model\en-us\en-us.lm.bin
with
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/en-70k-0.2.lm.gz/download
after unzipping it and renaming en-70k-0.2.lm to en-us.lm.bin.

But loading is very slow and the result is the same as before: "How are you" is not recognized.

ERROR: "dict.c", line 195: Line 134989: Phone 'IY1' is mising in the acoustic model; word 'zwieg' ignored
ERROR: "dict.c", line 195: Line 134990: Phone 'IH1' is mising in the acoustic model; word 'zwilling' ignored
ERROR: "dict.c", line 195: Line 134991: Phone 'AH0' is mising in the acoustic model; word 'zwolinski' ignored
ERROR: "dict.c", line 195: Line 134992: Phone 'IH1' is mising in the acoustic model; word 'zycad' ignored
ERROR: "dict.c", line 195: Line 134993: Phone 'AY1' is mising in the acoustic model; word 'zych' ignored
ERROR: "dict.c", line 195: Line 134994: Phone 'IH1' is mising in the acoustic model; word 'zycher' ignored
ERROR: "dict.c", line 195: Line 134995: Phone 'AY1' is mising in the acoustic model; word 'zydeco' ignored
ERROR: "dict.c", line 195: Line 134996: Phone 'IH1' is mising in the acoustic model; word 'zygmunt' ignored
ERROR: "dict.c", line 195: Line 134997: Phone 'AY1' is mising in the acoustic model; word 'zygote' ignored
ERROR: "dict.c", line 195: Line 134998: Phone 'IH1' is mising in the acoustic model; word 'zyla' ignored
ERROR: "dict.c", line 195: Line 134999: Phone 'IH1' is mising in the acoustic model; word 'zylka' ignored
ERROR: "dict.c", line 195: Line 135000: Phone 'IH1' is mising in the acoustic model; word 'zylstra' ignored
ERROR: "dict.c", line 195: Line 135001: Phone 'AY1' is mising in the acoustic model; word 'zyman' ignored
ERROR: "dict.c", line 195: Line 135002: Phone 'IH1' is mising in the acoustic model; word 'zynda' ignored
ERROR: "dict.c", line 195: Line 135003: Phone 'AY1' is mising in the acoustic model; word 'zysk' ignored
ERROR: "dict.c", line 195: Line 135004: Phone 'IH0' is mising in the acoustic model; word 'zyskowski' ignored
ERROR: "dict.c", line 195: Line 135005: Phone 'UW1' is mising in the acoustic model; word 'zyuganov' ignored
ERROR: "dict.c", line 195: Line 135006: Phone 'UW1' is mising in the acoustic model; word 'zyuganov(2)' ignored
ERROR: "dict.c", line 195: Line 135007: Phone 'UW1' is mising in the acoustic model; word 'zyuganov's' ignored
ERROR: "dict.c", line 195: Line 135008: Phone 'UW1' is mising in the acoustic model; word 'zyuganov's(2)' ignored
ERROR: "dict.c", line 195: Line 135009: Phone 'IH0' is mising in the acoustic model; word 'zywicki' ignored
INFO: dict.c(213): Dictionary size 8, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 8 words read
INFO: dict.c(358): Reading filler dictionary: C:\Users\martlee2\Downloads\pocketsphinx\model\en-us/en-us/noisedict
INFO: dict.c(213): Dictionary size 13, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(193): LM of order 3
INFO: ngram_model_trie.c(195): #1-grams: 72547
INFO: ngram_model_trie.c(195): #2-grams: 9704821
INFO: ngram_model_trie.c(195): #3-grams: 12264838
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 3 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 128
INFO: ngram_search_fwdtree.c(333): Created 2 root, 0 non-root channels, 8 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 0.00 CPU -nan(ind) xRT
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.00 wall -nan(ind) xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.00 CPU -nan(ind) xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.00 wall -nan(ind) xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU -nan(ind) xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall -nan(ind) xRT
INFO: cmn_live.c(120): Update from < 40.00  3.00 -1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
INFO: cmn_live.c(138): Update to   < 79.51 -17.97 -4.18 -4.96 -3.30 -2.92 -1.05 -0.39 -0.53 -1.86  0.88 -1.45 -0.71 >
INFO: ngram_search_fwdtree.c(1550):      524 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1552):     7630 senones evaluated (76/fr)
INFO: ngram_search_fwdtree.c(1556):     9509 channels searched (95/fr), 192 1st, 9317 last
INFO: ngram_search_fwdtree.c(1559):      854 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561):      342 candidate words for entering last phone (3/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.03 CPU 0.031 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.05 wall 0.052 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 5 words
INFO: ngram_search_fwdflat.c(948):      465 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950):     6855 senones evaluated (69/fr)
INFO: ngram_search_fwdflat.c(952):     9699 channels searched (96/fr)
INFO: ngram_search_fwdflat.c(954):      771 words searched (7/fr)
INFO: ngram_search_fwdflat.c(957):      226 word transitions (2/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.02 CPU 0.016 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.02 wall 0.016 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.33
INFO: ngram_search.c(1276): Eliminated 1 nodes before end node
INFO: ngram_search.c(1381): Lattice has 130 nodes, 79 links
INFO: ps_lattice.c(1380): Bestpath score: -1414
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:33:98) = -141601
INFO: ps_lattice.c(1441): Joint P(O,S) = -154219 P(S|O) = -12618
INFO: ngram_search.c(1027): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(1030): bestpath 0.00 wall 0.000 xRT
('Best hypothesis segments: ', ['<s>', '</s>'])
INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 0.03 CPU 0.032 xRT
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.05 wall 0.053 xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.02 CPU 0.016 xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.02 wall 0.016 xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall 0.000 xRT
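
The log above shows the decoder rejecting the trie binary header and then falling back to ARPA parsing ("Header doesn't match", then "Trying to read LM in arpa format"). That suggests renaming en-70k-0.2.lm to en-us.lm.bin did not change its format, and parsing a large ARPA file at startup is a plausible cause of the slow loading. Below is a minimal sketch for checking a file before loading it, assuming an ARPA file carries a `\data\` line near its top. `looks_like_arpa` and its `max_lines` parameter are hypothetical names introduced here, not part of pocketsphinx.

```python
import io


def looks_like_arpa(path, max_lines=50):
    """Heuristic: does a text ARPA header appear near the top of the file?"""
    with io.open(path, "rb") as fh:
        for _ in range(max_lines):
            # Bound each read so a binary file without newlines stays cheap.
            line = fh.readline(4096)
            if not line:
                break
            # ARPA files introduce their n-gram counts with a literal \data\ line.
            if line.strip() == b"\\data\\":
                return True
    return False


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        kind = "ARPA text" if looks_like_arpa(name) else "not ARPA text"
        print("%s: %s" % (name, kind))
```

If the check returns True, the file is still a text ARPA model regardless of its extension. The sphinxbase tool sphinx_lm_convert (for example `sphinx_lm_convert -i en-70k-0.2.lm -o en-70k-0.2.lm.bin`) is the usual way to produce a binary model, though the exact invocation should be confirmed against the official tutorial.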

nshmyrev commented on August 19, 2024

> is below code the correct way to record as raw file in python 2.7?

Yes

> replace
> C:\Users\martlee2\Downloads\pocketsphinx\model\en-us\cmudict-en-us.dict
> with
> https://github.com/cmusphinx/cmudict/blob/master/cmudict.dict

This is a bad idea
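
One likely reason, judging from the log in the previous comment: the replacement dictionary marks vowel stress with digits (IY1, AH0, AY1), while the en-us acoustic model apparently has no phones with those names, so almost every entry is ignored and only 8 words are loaded. The sketch below strips the stress digits from such entries. It assumes that stress digits are the only mismatch and that trailing comments start with " #". `strip_stress` and `convert_dict` are hypothetical helpers, and the sample entries are illustrative rather than copied from the dictionary.

```python
import re

# Stress digits (0, 1, 2) appended to vowel phones, e.g. "IY1".
_STRESS = re.compile(r"\d+$")


def strip_stress(line):
    """Return a dictionary line with stress digits removed from each phone.

    The word (first token) is left untouched, so variant markers such as
    "zyuganov(2)" survive.
    """
    # Assumption: a trailing comment, if any, is introduced by " #".
    line = line.split(" #", 1)[0]
    tokens = line.split()
    if not tokens:
        return ""
    word, phones = tokens[0], tokens[1:]
    return " ".join([word] + [_STRESS.sub("", p) for p in phones])


def convert_dict(src_lines):
    """Convert an iterable of dictionary lines, skipping blank ones."""
    converted = (strip_stress(entry) for entry in src_lines)
    return [entry for entry in converted if entry]


if __name__ == "__main__":
    # Illustrative entries only, not taken from the real dictionary file.
    sample = ["zygote Z AY1 G OW0 T", "zyuganov(2) Z UW1 G AA0 N AA0 V"]
    for entry in convert_dict(sample):
        print(entry)
```

Even with the stress digits removed, the dictionary bundled with the model is the safer choice, since it is the one the model is distributed with.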

> The result return ('Best hypothesis segments: ', ['<s>', 'huh'])

Share the file to get help on this issue

hoyeunglee commented on August 19, 2024

https://drive.google.com/file/d/0Bxs_ao6uuBDUb2JhWXVnU2JGQjg/view?usp=sharing
https://drive.google.com/file/d/0Bxs_ao6uuBDUbE56Y19zSHA1SjA/view?usp=sharing
https://drive.google.com/file/d/0Bxs_ao6uuBDUQ2s5NXd3cU9XY28/view?usp=sharing
https://drive.google.com/file/d/0Bxs_ao6uuBDUOXBzalAyNFJ2c0U/view?usp=sharing

Here are the files.

I tried

https://github.com/cmusphinx/pocketsphinx

and
https://github.com/hansonrobotics/pocketsphinx

and

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/

and found that the model files are the same size in all of them, so I chose one of them to use.

nshmyrev commented on August 19, 2024

I mean the audio file you created.

As for the different versions, you need to strictly follow the official tutorial; we cannot help you with bugs in other forks.
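
For the audio file itself, it may help to confirm the format before sharing it. The default en-us model expects 16 kHz, 16-bit, mono PCM, and a raw file is simply those samples without a header. Below is a minimal sketch using only the standard library, assuming the recording was first saved as a WAV file. `wav_to_raw` and `EXPECTED` are hypothetical names, and this is not the recording code from the original question.

```python
import wave

# Format assumed for the default en-us model: 16 kHz, 16-bit, mono.
EXPECTED = {"channels": 1, "sample_width": 2, "rate": 16000}


def wav_to_raw(wav_path, raw_path):
    """Copy the PCM samples of a WAV file into a headerless .raw file.

    Raises ValueError if the WAV is not 16 kHz, 16-bit, mono.
    Returns the number of bytes written.
    """
    wf = wave.open(wav_path, "rb")
    try:
        actual = {
            "channels": wf.getnchannels(),
            "sample_width": wf.getsampwidth(),
            "rate": wf.getframerate(),
        }
        if actual != EXPECTED:
            raise ValueError("unexpected audio format: %r" % (actual,))
        frames = wf.readframes(wf.getnframes())
    finally:
        wf.close()
    # A raw file is just the sample bytes, with no header at all.
    with open(raw_path, "wb") as out:
        out.write(frames)
    return len(frames)
```

A 3 second recording in this format should come out at 96000 bytes (3 s x 16000 samples/s x 2 bytes), which gives a quick sanity check on the file size.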

hoyeunglee commented on August 19, 2024

https://drive.google.com/file/d/0Bxs_ao6uuBDUZUJUc0JGMlV3aXM/view?usp=sharing

The file is also available under pocketsphinx\test\data.

hoyeunglee commented on August 19, 2024

I do not know whether the original dictionary and model files contain all the words, so I used the latest files from
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/
and
https://github.com/cmusphinx/cmudict
and replaced the originals as mentioned before.
