Hi, did you add word level language model for beam search? Currently


Language model at word level (ctcdecoder issue, closed)

githubharald commented on May 21, 2024

Language model at word level

from ctcdecoder.

Comments (4)

githubharald commented on May 21, 2024

1. Just checking whether the words exist would be the easiest way to go:
A. You could check whether the words in a beam exist in your dictionary. Each time a labelling gets extended by a whitespace in the function calcExtPr, you could check whether the last word exists and assign a probability of 1 if it does and 0 otherwise.
B. Or you could build a dictionary of prefixes of the dictionary words (e.g. Hello -> H, He, Hel, ...) by using a prefix tree. Then you know which beams can be extended by which characters.
2. Using a word-level bigram LM is not that easy. You can only score neighbouring words with a bigram after both words have been fully added to the beam. But you could give it a try: score the last two words of a beam as soon as it is possible. This would at least remove beams that represent nonsense from an LM point of view, even if this scoring happens a bit late. I think a clever combination of a word-level LM and a prefix tree could give good results and would be fast (it reduces the number of beams).
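
The dictionary check and the prefix tree described above can be sketched roughly as follows. This is only a minimal illustration, not code from ctcdecoder: the names `Node`, `PrefixTree`, `contains_word` and `next_chars` are made up for this example, and hooking the checks into calcExtPr is left out.

```python
class Node:
    """One node of the prefix tree; each child is reached by one character."""

    def __init__(self):
        self.children = {}    # character -> Node
        self.is_word = False  # True if the path from the root spells a dictionary word


class PrefixTree:
    """Hypothetical helper: stores dictionary words so prefixes can be queried."""

    def __init__(self, words):
        self.root = Node()
        for word in words:
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, Node())
            node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None if it leaves the tree."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains_word(self, text):
        """Dictionary check: is text a complete word?"""
        node = self._find(text)
        return node is not None and node.is_word

    def next_chars(self, prefix):
        """Characters that may extend prefix towards some dictionary word."""
        node = self._find(prefix)
        return sorted(node.children) if node is not None else []


tree = PrefixTree(["Hello", "Help", "He"])
print(tree.contains_word("He"))   # complete word
print(tree.contains_word("Hel"))  # only a prefix
print(tree.next_chars("Hel"))     # allowed extensions of the beam's last word
```

In a beam search, `contains_word` would be called when a beam is extended by a whitespace, and `next_chars` would restrict which characters a beam may be extended by while it is inside a word.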


marcoleewow commented on May 21, 2024

I have done 1.A together with a long-word penalty, but this method has no prior knowledge at the word bigram level, which means it only works as an autocorrect.

Example: the words in "milk the cous" are all in the dictionary, but the phrase does not make sense, whereas the true label we want is "milk the cows".
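
To make the example concrete, here is a minimal sketch of how a word bigram score would separate the two candidates. The counts are toy values invented for illustration, the function names are hypothetical, and add-alpha smoothing is just one simple choice. It scores complete texts only and does not address when to apply the score inside the beam search.

```python
import math

# Toy counts, invented for this illustration only.
UNIGRAM_COUNTS = {"milk": 5, "the": 20, "cows": 4, "cous": 1}
BIGRAM_COUNTS = {("milk", "the"): 3, ("the", "cows"): 4}


def bigram_log_prob(prev_word, word, alpha=1.0):
    """Add-alpha smoothed log P(word | prev_word) from the toy counts."""
    vocab_size = len(UNIGRAM_COUNTS)
    numerator = BIGRAM_COUNTS.get((prev_word, word), 0) + alpha
    denominator = UNIGRAM_COUNTS.get(prev_word, 0) + alpha * vocab_size
    return math.log(numerator / denominator)


def score_text(text):
    """Sum of bigram log-probabilities over all neighbouring word pairs."""
    words = text.split()
    return sum(bigram_log_prob(a, b) for a, b in zip(words, words[1:]))


for candidate in ("milk the cows", "milk the cous"):
    print(candidate, score_text(candidate))
```

Both candidates pass the dictionary check, but the pair ("the", "cous") was never seen in the toy counts, so "milk the cows" gets the higher score.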

For 2, I have tried applying bigram scores whenever I see a space label, but this pushes the beam out of the beam width, and what I get is a long single word a lot of the time.

Currently I am reading about WFST and trying to implement a CTC decoder using WFST so that I can include a word-level bigram LM. Have you tried these methods?


githubharald commented on May 21, 2024

No, I haven't tried WFST yet.


githubharald commented on May 21, 2024

I've implemented an algorithm which uses beam search at word level (dictionary, unigrams/bigrams) and which runs faster than token passing: https://github.com/githubharald/CTCWordBeamSearch
