ecoz2-sr4x's Introduction

VQ/HMM based SR4X speech classification

The commands described below operate on speech wave files from the SR4X v1.2 corpus sample. The files were slightly reorganized to facilitate the use of the ECOZ2 tools: those corresponding to the word <className> are located (in a separate space) under ../SR4X/speech/<className>/.

Status

Satisfactory but basic testing complete. Minimal model tuning. Further updates unlikely. Main goal has been to capture initial tests on a revision of the VQ/HMM code I wrote many years ago.

Note: These notes are pretty terse in general, but the reported confusion matrices and ranked candidate tables can probably at least provide a good sense of the results.

Endpoint detection

sgn.endp ../SR4X/speech/*/*[0-9].wav

The files generated by this command are also placed in the separate space mentioned above. Each one goes in the same directory as its input file, with the original name as prefix and information about the extracted interval as suffix: <path-to-name>__S<start>_L<length>$.wav, where <start> is the start index of the detection with respect to the input signal, and <length> is the size of the detection.
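As an illustration of the naming scheme just described, here is a minimal Python sketch that recovers the prefix, start index and length from such a file name. It is not part of the ECOZ2 tools; the regular expression is inferred only from the pattern above, and the sample path used in the usage note is hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    prefix: str   # original path and name, without the interval suffix
    start: int    # start index of the detection in the input signal
    length: int   # size of the detection


# Assumed layout: <path-to-name>__S<start>_L<length>$.wav
# (the literal '$' before '.wav' is part of the file name).
_ENDPOINT_RE = re.compile(
    r"^(?P<prefix>.+)__S(?P<start>\d+)_L(?P<length>\d+)\$\.wav$"
)


def parse_endpoint_name(path: str) -> Optional[Endpoint]:
    """Return the parsed interval info, or None if the name does not match."""
    m = _ENDPOINT_RE.match(path)
    if m is None:
        return None
    return Endpoint(m.group("prefix"), int(m.group("start")), int(m.group("length")))
```

For a hypothetical file ../SR4X/speech/A/00123__S1000_L5000$.wav, parse_endpoint_name would give the prefix ../SR4X/speech/A/00123, start 1000 and length 5000; an original, undetected file such as ../SR4X/speech/A/00123.wav yields None.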

Predictor files

Using the endpoint-detected files above, the following starts populating the ./data/ subdirectory here with corresponding "predictor" files:

lpc -P 12 -W 45 -O 15 -m 10 -s 0.9 ../SR4X/speech/*/*$.wav
  • -P 12: prediction order 12;
  • -W 45: 45-ms analysis window size;
  • -O 15: 15-ms window offset;
  • -m 10: only consider classes with at least 10 signal files;
  • -s 0.9: split the set of files into approximately 90% for a training subset and 10% for a testing subset. With this option the resulting predictor files are generated under data/predictors/TRAIN/ and data/predictors/TEST/ respectively.
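To give a feel for what these options imply, here is a rough Python sketch, not the lpc implementation. frame_layout converts the window and offset from milliseconds to samples and counts the full windows that fit in a signal; the sample rate is an assumption, since these notes do not state one, and how lpc treats a trailing partial window is not known here. select_and_split mimics the effect of -m and -s by dropping small classes and splitting the rest; doing the split per class, with a seeded shuffle, is also an assumption.

```python
import random
from typing import Dict, List, Tuple


def frame_layout(num_samples: int, sample_rate_hz: int,
                 window_ms: int = 45, offset_ms: int = 15) -> Tuple[int, int, int]:
    """Return (window size, offset, number of full windows), sizes in samples."""
    window = sample_rate_hz * window_ms // 1000
    offset = sample_rate_hz * offset_ms // 1000
    if num_samples < window:
        return window, offset, 0
    # One window at position 0, then one more per full offset step that fits.
    return window, offset, 1 + (num_samples - window) // offset


def select_and_split(files_by_class: Dict[str, List[str]],
                     min_files: int = 10, train_ratio: float = 0.9,
                     seed: int = 0) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Drop classes with fewer than min_files files; split the rest train/test."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    train: Dict[str, List[str]] = {}
    test: Dict[str, List[str]] = {}
    for name, files in sorted(files_by_class.items()):
        if len(files) < min_files:
            continue  # analogous to -m 10
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_ratio)  # analogous to -s 0.9
        train[name] = shuffled[:cut]
        test[name] = shuffled[cut:]
    return train, test
```

Assuming a 16 kHz signal, frame_layout(16000, 16000) gives a 720-sample window, a 240-sample offset and 64 full windows for one second of audio. A class with 10 files ends up with 9 training files and 1 testing file, while a class with fewer than 10 files is skipped entirely.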

Note

None of the following files are committed to version control in this repository:

  • The WAV files indicated above
  • ECOZ2 executables (lpc, vq.learn, hmm.classify, etc.)
  • Binary files generated by the commands in the exercises. Only the various "report" files (.rpt) created during codebook and HMM model training are committed.

The exercises

See:

  • vq.md
  • hmm.md
