Indonesian Automatic Speech Recognition

cd training

  1. Create MFCC files

python feature_extraction.py dataset/train

  2. Create Monophone Model
     1. Prototype
`python hmm_prototype.py files` (this step can be skipped)
     2. Wordlist
`HDMan -m -w wordlist/wlist -n monophones1 -l dlog dict wordlist/indonesian.lex`

Edit "dict" by adding
SENT-END    sil
SENT-START  sil
silence     sil

at the correct position (the file must stay sorted)
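
A minimal sketch of that edit, assuming the dictionary is ordered by plain code-point comparison of the head word (uppercase before lowercase); check this against the order your HTK tools expect. The function name `add_silence_entries` is hypothetical and not part of this repository.

```python
def add_silence_entries(dict_lines):
    """Return the dict lines with the three silence entries merged in, sorted."""
    extra = ["SENT-END    sil", "SENT-START  sil", "silence     sil"]
    # Drop blank lines; using a set avoids adding an entry that is already there.
    merged = {line.rstrip("\n") for line in dict_lines if line.strip()}
    merged.update(extra)
    # Sort by head word, then by the whole line for words with several entries.
    # Assumption: code-point order is what the HTK tools expect here.
    return sorted(merged, key=lambda line: (line.split()[0], line))


if __name__ == "__main__":
    for entry in add_silence_entries(["saya  s a y a sp", "aku  a k u sp"]):
        print(entry)
```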

Run `python mlf.py`

Create the following edit script "mkphones0.led" containing:
EX
IS sil sil
DE sp

Windows: `HLEd -l * -d wordlist/dict -i wordlist/phones0.mlf wordlist/mkphones0.led wordlist/words_sanitize.mlf` (yes, run this one)

Ubuntu: `HLEd -l '*' -d wordlist/dict -i wordlist/phones0.mlf wordlist/mkphones0.led wordlist/words_sanitize.mlf` (yes, run this one)

(This step is painful: it reveals every word that is missing from wlist and dict, so cross-check until everything is correct.)
(ERROR [+6550]  LoadHTKLabels: Junk at end of HTK transcription -> remember to delete any line that contains nothing but whitespace; the regex `^\n` finds them)
(ERROR [+6550]  LoadHTKList: Label Name Expected -> caused by labels that are numbers)
(ERROR [+1232]  NumParts: Cannot find word %s in dictionary -> add the missing words by hand; hang in there)
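
The three causes listed above can be looked for before running HLEd. The following is a rough sketch, assuming a word-level MLF with one word per line, quoted label names, and a lone `.` closing each utterance; `find_mlf_problems` is a hypothetical helper, not a script from this repository.

```python
import re


def find_mlf_problems(mlf_lines, dict_words):
    """Return (line_number, problem, text) tuples for a word-level MLF."""
    problems = []
    for number, raw in enumerate(mlf_lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            # Blank lines are the reported cause of "Junk at end of HTK transcription".
            problems.append((number, "blank line", line))
            continue
        # Skip the MLF header, label file names, and the end-of-utterance dot.
        if line == "#!MLF!#" or line.startswith('"') or line == ".":
            continue
        if re.fullmatch(r"\d+", line):
            # Numeric labels are the reported cause of "Label Name Expected".
            problems.append((number, "numeric label", line))
        elif line not in dict_words:
            # Would lead to "Cannot find word ... in dictionary".
            problems.append((number, "not in dict", line))
    return problems
```

Passing the set of head words from `wordlist/dict` as `dict_words` gives a list of lines to fix by hand.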

Once all the errors above are fixed, run
`python sanitizer.py`
to remove from the MLF and SCP any recordings that are not covered by dict at all.
It writes SCP and MLF files with the `_sanitize` suffix.
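
The source of `sanitizer.py` is not shown here, so the following is only a sketch of the idea under stated assumptions: an utterance is dropped when any of its words is absent from dict, and SCP lines are matched to MLF entries by file name without directory or extension. The names `parse_mlf` and `sanitize` are hypothetical.

```python
from pathlib import PurePosixPath


def parse_mlf(mlf_lines):
    """Map each label name (no directory, no extension) to its list of words."""
    entries, current = {}, None
    for raw in mlf_lines:
        line = raw.strip()
        if line.startswith('"'):
            current = PurePosixPath(line.strip('"')).stem
            entries[current] = []
        elif line == ".":
            current = None
        elif current is not None and line:
            entries[current].append(line)
    return entries


def sanitize(mlf_lines, scp_lines, dict_words):
    """Keep only utterances whose words are all in the dictionary."""
    entries = parse_mlf(mlf_lines)
    kept = {name: words for name, words in entries.items()
            if all(word in dict_words for word in words)}
    # Keep the SCP lines that still have a transcription.
    kept_scp = [path.strip() for path in scp_lines
                if PurePosixPath(path.strip()).stem in kept]
    return kept, kept_scp
```
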
  3. HMM0-Init
`mkdir hmm0`

`HCompV -C config/conf-train -f 0.01 -m -S files/train_sanitize.scp -M hmm0 files/proto.hmm`

Create monophones0 from monophones1, leaving out the 'sp' entry.
Then create the hmmdefs and macros files.
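
One possible way to do both steps, following the approach of the VoxForge tutorial as far as I understand it: hmmdefs holds one copy of the HCompV prototype per phone, each renamed with `~h "phone"`. The helper names are hypothetical, and building macros (in that tutorial, the prototype's `~o` header followed by the contents of vFloors) is not covered by this sketch.

```python
def make_monophones0(monophones1):
    """Drop the short-pause model 'sp' from the monophone list."""
    return [phone for phone in monophones1 if phone != "sp"]


def make_hmmdefs(proto_text, phones):
    """Clone the prototype body once per phone.

    Assumption: the prototype contains exactly one <BEGINHMM> ... <ENDHMM> block.
    """
    start = proto_text.index("<BEGINHMM>")
    end = proto_text.index("<ENDHMM>") + len("<ENDHMM>")
    body = proto_text[start:end]
    return "".join(f'~h "{phone}"\n{body}\n' for phone in phones)
```
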
  4. Create Model

    1. HMM-1

    mkdir hmm1

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm0/macros -H hmm0/hmmdefs -M hmm1 wordlist/monophones0

    2. HMM-2

    mkdir hmm2

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm1/macros -H hmm1/hmmdefs -M hmm2 wordlist/monophones0

    3. HMM-3

    mkdir hmm3

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm2/macros -H hmm2/hmmdefs -M hmm3 wordlist/monophones0

    4. HMM-4

    mkdir hmm4

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm3/macros -H hmm3/hmmdefs -M hmm4 wordlist/monophones0

    5. HMM-5

    mkdir hmm5

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm4/macros -H hmm4/hmmdefs -M hmm5 wordlist/monophones0

    6. HMM-6

    mkdir hmm6

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm5/macros -H hmm5/hmmdefs -M hmm6 wordlist/monophones0

    7. HMM-7

    mkdir hmm7

    HERest -A -D -T 1 -C config/conf-train -I wordlist/phones0.mlf -t 250.0 150.0 1000.0 -S files/train_sanitize.scp -H hmm6/macros -H hmm6/hmmdefs -M hmm7 wordlist/monophones0
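
The seven passes above differ only in the input and output directory, so they can be generated in a loop. This is a sketch rather than a script from this repository: `herest_command` and `run_reestimation` are hypothetical names, and the default arguments simply mirror the commands shown above.

```python
import os
import subprocess


def herest_command(step, mlf="wordlist/phones0.mlf",
                   hmmlist="wordlist/monophones0", prune="1000.0"):
    """Build the HERest call that re-estimates hmm<step-1> into hmm<step>."""
    prev, out = f"hmm{step - 1}", f"hmm{step}"
    return ["HERest", "-A", "-D", "-T", "1", "-C", "config/conf-train",
            "-I", mlf, "-t", "250.0", "150.0", prune,
            "-S", "files/train_sanitize.scp",
            "-H", f"{prev}/macros", "-H", f"{prev}/hmmdefs",
            "-M", out, hmmlist]


def run_reestimation(first, last, **kwargs):
    """Create each output directory and run HERest for steps first..last."""
    for step in range(first, last + 1):
        os.makedirs(f"hmm{step}", exist_ok=True)
        subprocess.run(herest_command(step, **kwargs), check=True)
```

`run_reestimation(1, 7)` would reproduce HMM-1 through HMM-7; other passes can reuse it by passing a different `mlf`, `hmmlist`, or `prune` value.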

5. Silence

   1. Copy directory hmm7 to hmm8
      
      `xcopy hmm7 hmm8`
         
   2. Copy and paste the “sil” model and rename the new one “sp” (don't delete your old “sil” model, you will need it; just make a copy of it)
   
      Remove state 2 and 4 from new “sp” model (i.e. keep 'centre state' of old “sil” model in new “sp” model)
      
      change <NUMSTATES> to 3
      
      change <STATE> to 2
      
      change <TRANSP> to 3
      
      change matrix in <TRANSP> to 3 by 3 array
      
      change numbers in matrix as follows:
       `0.0 1.0 0.0
       0.0 0.9 0.1
       0.0 0.0 0.0`
      
   3. HHEd
       Create the "sil.hed" script containing:
        `AT 2 4 0.2 {sil.transP}
        AT 4 2 0.2 {sil.transP}
        AT 1 3 0.3 {sp.transP}
        TI silst {sil.state[3],sp.state[2]}`
   
      `mkdir hmm9`
      
      `HHEd -H hmm8/macros -H hmm8/hmmdefs -M hmm9 wordlist/sil.hed wordlist/monophones1`
      
   4. HERest 2x
   
      Windows: `HLEd -l * -d wordlist/dict -i wordlist/phones1.mlf wordlist/mkphones1.led wordlist/words_sanitize.mlf`
      
      Ubuntu: `HLEd -l '*' -d wordlist/dict -i wordlist/phones1.mlf wordlist/mkphones1.led wordlist/words_sanitize.mlf`
      
      `mkdir hmm10`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/phones1.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm9/macros -H  hmm9/hmmdefs -M hmm10 wordlist/monophones1`
      
      `mkdir hmm11`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/phones1.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm10/macros -H  hmm10/hmmdefs -M hmm11 wordlist/monophones1`
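
The manual edit that turns a copy of “sil” into “sp” in the Silence step can also be scripted. The sketch below assumes the text form of hmmdefs with each `<STATE> n` keyword on its own line and a 5-state “sil” model; `make_sp_model` is a hypothetical helper, and its output should be compared with a hand-edited model before it is trusted.

```python
def make_sp_model(sil_text):
    """Build a 3-state 'sp' model that reuses the centre state of 'sil'."""
    lines = sil_text.strip().splitlines()

    def find(first, second):
        # Index of the line whose first two tokens are e.g. "<STATE>" "3".
        return next(i for i, line in enumerate(lines)
                    if line.split()[:2] == [first, second])

    state3 = find("<STATE>", "3")
    state4 = find("<STATE>", "4")
    centre = lines[state3 + 1:state4]  # parameters of the centre state only
    sp_lines = ['~h "sp"', "<BEGINHMM>", "<NUMSTATES> 3", "<STATE> 2", *centre,
                "<TRANSP> 3",
                " 0.0 1.0 0.0",
                " 0.0 0.9 0.1",
                " 0.0 0.0 0.0",
                "<ENDHMM>"]
    return "\n".join(sp_lines) + "\n"
```

Appending the returned text to `hmm8/hmmdefs` would correspond to the manual copy-and-edit, leaving the original “sil” model untouched.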
      
6. Realigning the data
   
   1. HVite
    
      VoxForge uses `1000.0` here, but with that value, and also with `3000.0`, some labels went missing, so I changed it to `5000.0`.
      
      Windows: `HVite -A -D -T 1 -l * -o SWT -b SENT-END -C config/conf-train -H hmm11/macros -H hmm11/hmmdefs -i wordlist/aligned.mlf -m -t 250.0 150.0 5000.0 -y lab -a -I wordlist/words_sanitize.mlf -S files/train_sanitize.scp wordlist/dict wordlist/monophones1> HVite_log`
      
      Ubuntu: `HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config/conf-train -H hmm11/macros -H hmm11/hmmdefs -i wordlist/aligned.mlf -m -t 250.0 150.0 5000.0 -y lab -a -I wordlist/words_sanitize.mlf -S files/train_sanitize.scp wordlist/dict wordlist/monophones1> HVite_log`

   2. HERest, repeated many times :'(
   
      `mkdir hmm12`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm11/macros -H  hmm11/hmmdefs -M hmm12 wordlist/monophones1`
      
      `mkdir hmm13`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm12/macros -H  hmm12/hmmdefs -M hmm13 wordlist/monophones1`
      
      `mkdir hmm14`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm13/macros -H  hmm13/hmmdefs -M hmm14 wordlist/monophones1`
      
      `mkdir hmm15`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm14/macros -H  hmm14/hmmdefs -M hmm15 wordlist/monophones1`
      
      `mkdir hmm16`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm15/macros -H  hmm15/hmmdefs -M hmm16 wordlist/monophones1`
      
      `mkdir hmm17`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm16/macros -H  hmm16/hmmdefs -M hmm17 wordlist/monophones1`
      
      `mkdir hmm18`
      
      `HERest -A -D -T 1 -C config/conf-train  -I wordlist/aligned.mlf -t 250.0 150.0 3000.0 -S files/train_sanitize.scp -H hmm17/macros -H  hmm17/hmmdefs -M hmm18 wordlist/monophones1`
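
The HVite step in this section mentions labels going missing at lower pruning thresholds. A quick way to check a given threshold, assuming a lost label shows up as an utterance that has no entry in `wordlist/aligned.mlf`, is to compare that file against the SCP list. `missing_alignments` is a hypothetical helper that matches files by name without directory or extension.

```python
from pathlib import PurePosixPath


def missing_alignments(scp_lines, aligned_mlf_lines):
    """Return the SCP entries that have no label entry in the aligned MLF."""
    aligned = {PurePosixPath(line.strip().strip('"')).stem
               for line in aligned_mlf_lines
               if line.strip().startswith('"')}
    return [path.strip() for path in scp_lines
            if path.strip() and PurePosixPath(path.strip()).stem not in aligned]
```

An empty result suggests every training file was aligned, in which case the `train_sanitize.scp` used for the following HERest passes still matches the MLF.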

cd ..

mkdir decoder

cd decoder (indonesian-asr/decoder)

7. Recognizer evaluation

    # Without an n-gram LM

    HBuild wordlist/wlist_sanitize result/network.w

    HVite -H hmm18/macros -H hmm18/hmmdefs -S files/train_sanitize.scp -l '*' -i result/recout_nogram.mlf -p 0.0 -s 5.0 -w result/network.w wordlist/dict wordlist/monophones1

    HResults -I wordlist/words_sanitize.mlf wordlist/monophones1 result/recout_nogram.mlf

    # With an n-gram LM

    HBuild wordlist/wlist_sanitize result/network.w -n lm/lm.arpa

    HVite -H hmm18/macros -H hmm18/hmmdefs -S files/train_sanitize.scp -l '*' -i result/recout_gram.mlf -p 0.0 -s 5.0 -w result/network.w wordlist/dict wordlist/monophones1

    HResults -I wordlist/words_sanitize.mlf wordlist/monophones1 result/recout_gram.mlf
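
For reading the HResults output of the two runs above: HTK reports percent correct as H/N and accuracy as (H - I)/N, where N is the number of reference labels, H = N - D - S is the number of hits, and D, S, I are deletions, substitutions, and insertions. A small sketch of that arithmetic, with a hypothetical function name and made-up counts:

```python
def word_scores(n, deletions, substitutions, insertions):
    """Return (percent correct, percent accuracy) from HResults-style counts."""
    hits = n - deletions - substitutions
    correct = 100.0 * hits / n
    # Accuracy also penalises insertions, so it can be lower than correct.
    accuracy = 100.0 * (hits - insertions) / n
    return correct, accuracy


if __name__ == "__main__":
    # Illustrative counts only, not results from this project.
    print(word_scores(100, 5, 10, 3))
```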





8. Decoder

   1. Install Julius (http://julius.osdn.jp/en_index.php)
   2. Download SRILM (http://www.speech.sri.com/projects/srilm/)
   3. Build the LM:
   4. Build the AM: `mkbinhmm -htkconf ../training/config/conf-train ../training/hmm12/hmmdefs julius.am`

    Install HDecode: `nmake /f htk_hdecode_nt.mkf all`
   
    `mkdir result`
    
    `HDecode -A -D -T 1 -H hmm13/macros -H hmm13/hmmdefs -C config/conf-test -S files/test_sanitize.scp -l * -i result/recout.mlf -w lm/lm.arpa -p 0.0 -s 5.0 wordlist/dict wordlist/monophones1`
    
    Windows: `HVite -H hmm13/macros -H hmm13/hmmdefs -S files/test_sanitize.scp -l * -i result/recout.mlf -w ../lm/lm.arpa -p 0.0 -s 5.0 wordlist/dict wordlist/monophones1`
    
    
    
    8. STEP 9
    
    Windows: HLEd -A -D -T 1 -n wordlist/mktri.led -l * -i wordlist/wintri.mlf wordlist/mktri.led wordlist/aligned.mlf
    
    HHEd -A -D -T 1 -H hmm18/macros -H hmm18/hmmdefs -M hmm19 wordlist/mktri.hed wordlist/monophones1

Web

Make sure Node.js is installed on your computer

Run the program

  1. Make sure you are in the web folder (`cd web`)
  2. Open a Node.js terminal (on Windows) or a normal terminal (on Linux)
  3. Type node.js
  4. Open your browser and go to localhost:8800
  5. Make sure you allow microphone access in the browser
  6. Click Start Recording to record and Stop Recording to stop; the file is saved as web/demo.wav
  7. Play demo.wav with your audio application

Dependency

  1. npm install binaryjs
  2. npm install express
  3. npm install fs (optional: `fs` is a built-in Node.js module)
  4. npm install jade
  5. npm install wav
  6. npm install recordrtc
  7. npm install child_process (optional: `child_process` is a built-in Node.js module)

cobaexec.js

Run it with `node cobaexec.js`. To change the output, use the `stdout` variable inside the function.

Contributors

asanilta, feryandi, fitrakun, jessicahandayani, tifaniwarnita
