OpenDcd - An Open Source WFST based Speech Recognition Decoder

OpenDcd is a lightweight and portable WFST based speech decoding toolkit written in C++. OpenDcd provides a set of tools for decoding, cascade construction and hypothesis post-processing. The focus of the toolkit is to provide a foundation for research into new decoding methods that can be deployed. Through the use of C++ templates the core decoder can be configured and extended in many ways, for example by selecting different on-the-fly composition or lattice generation strategies. The core engine has detailed profiling, logging and analysis methods that make it highly suitable for deployment in production systems. The toolkit makes use of OpenFst for representing and manipulating the models. It is distributed as an open source project under an Apache License.
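
As a rough illustration of this kind of template-based configuration, the sketch below selects a weight semiring at compile time through a template parameter. The names `TropicalPolicy`, `LogPolicy` and `CombinePaths` are hypothetical and are not part of the OpenDcd API. The code only shows the general pattern, assuming weights are negative log probabilities.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical semiring policies; illustrative only, not OpenDcd classes.
// Weights are assumed to be negative log probabilities.
struct TropicalPolicy {
  static float Zero() { return std::numeric_limits<float>::infinity(); }
  // Viterbi-style combination: keep the better (smaller) cost.
  static float Plus(float a, float b) { return std::min(a, b); }
  // Extending a path adds costs.
  static float Times(float a, float b) { return a + b; }
};

struct LogPolicy {
  static float Zero() { return std::numeric_limits<float>::infinity(); }
  // Sums probabilities in the negative log domain: -log(e^-a + e^-b).
  static float Plus(float a, float b) {
    const float lo = std::min(a, b);
    const float hi = std::max(a, b);
    if (std::isinf(hi)) return lo;  // Zero() is the additive identity.
    return lo - std::log1p(std::exp(lo - hi));
  }
  static float Times(float a, float b) { return a + b; }
};

// Combines alternative paths, each given as a list of arc costs.
// The semiring is chosen at compile time, with no virtual dispatch.
template <class Semiring>
float CombinePaths(const std::vector<std::vector<float>>& paths) {
  float total = Semiring::Zero();
  for (const auto& path : paths) {
    float cost = 0.0f;  // Multiplicative identity in both semirings.
    for (float w : path) cost = Semiring::Times(cost, w);
    total = Semiring::Plus(total, cost);
  }
  return total;
}
```

For two paths whose arc costs sum to 3 and 1.5, `CombinePaths<TropicalPolicy>` keeps the better cost (1.5), while `CombinePaths<LogPolicy>` sums the path probabilities and so returns a slightly lower cost.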

For more information see the main documentation site, and the tutorial for installing OpenDcd and decoding using the Librispeech corpus and models from kaldi-asr.org.

Quick Installation Guide

    export KALDI_ROOT=/path/to/kaldi-trunk/
    git clone https://github.com/opendcd/opendcd.git
    cd opendcd/3rdparty
    make
    cd ../src
    make

Kaldi Conversion Quick Start

Kaldi model conversion and decoding require a working Kaldi installation and a set of acoustic models, language models and features generated by a Kaldi egs/s5 script. The following example is based on the output of a Kaldi WSJ training run.

Graph construction: the scripts directory contains makeclevel.sh, which takes the Kaldi language directory as input, so the existing Kaldi lexicon and LM are re-used.

    cd $OPENDCD/scripts
    export KALDI_ROOT=/home/opendcd/tools/kaldi-trunk
    ./makeclevel.sh \
    $KALDI_ROOT/egs/wsj/s5/data/lang_test_bg_5k \
    $KALDI_ROOT/egs/wsj/s5/exp/tri2a \
    $KALDI_ROOT/egs/wsj/s5/exp/ocd_tri2a \
    $KALDI_ROOT

The egs directory contains an example script showing how to convert a Kaldi WSJ setup.

Brief Overview

The first release includes the following features:

  • Standalone lightweight decoder core
  • Compatible with Kaldi file formats, or optionally built against Kaldi
  • Post-processing tools
  • OpenFst and Kaldi Interop Tools

Decoder:

  • Customizable transition model, with support for user-defined transition models
  • Direct decoding over different weight semirings
  • On-the-fly decoding using lookahead composition
  • Lattice generation
  • Switchable STL implementations. Use different implementations such as EASTL or RDESTL, or mix in optimized containers such as Google sparse hash.
  • Powerful registration mechanism for adding user-defined acoustic models and/or lattice generation strategies
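
The registration mechanism mentioned above can be pictured as a name-to-factory table. The following is a minimal sketch of that general pattern only. `AcousticModel`, `ModelRegistry` and `ConstantModel` are hypothetical names chosen for illustration, and the real OpenDcd interfaces may look quite different.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical acoustic model interface; not the actual OpenDcd API.
class AcousticModel {
 public:
  virtual ~AcousticModel() = default;
  virtual float Score(int state, int frame) const = 0;
};

// Maps a model name to a factory that builds the model.
class ModelRegistry {
 public:
  using Factory = std::function<std::unique_ptr<AcousticModel>()>;

  // Function-local static avoids static initialization order problems.
  static ModelRegistry& Instance() {
    static ModelRegistry registry;
    return registry;
  }

  // Returns false if the name is already registered.
  bool Register(const std::string& name, Factory factory) {
    return factories_.emplace(name, std::move(factory)).second;
  }

  // Returns nullptr for unknown names.
  std::unique_ptr<AcousticModel> Create(const std::string& name) const {
    auto it = factories_.find(name);
    if (it == factories_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

// A toy user-defined model that returns the same score everywhere.
class ConstantModel : public AcousticModel {
 public:
  float Score(int /*state*/, int /*frame*/) const override { return 1.0f; }
};

// Registers the model under the name "constant" before main() runs.
const bool kConstantRegistered = ModelRegistry::Instance().Register(
    "constant",
    [] { return std::unique_ptr<AcousticModel>(new ConstantModel); });
```

With this layout a decoder binary could look up a model by a name given on the command line, for example `ModelRegistry::Instance().Create("constant")`, without being recompiled for each new model type.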

Cascade construction:

  • Script to efficiently build and convert models from a Kaldi lang directory

Results post-processing:

  • farfilter: Apply a command to every FST in a FAR archive
  • latticetofar: Convert a Kaldi Table to an OpenFst FAR archive
  • fartolattice: Convert an OpenFst FAR archive to a Kaldi Table

Kaldi Interoperability:

  • Write results in Kaldi Lattice table format
  • More information on optionally building against Kaldi
  • Convert Kaldi tree to optimized decoding structure

More Information

  • A getting started guide for running OpenDcd on EC2 using the Librispeech models
  • Ongoing introductory slides can be found here. These are updated infrequently.

We request that any publication making use of this software cite the paper below.

@Article{dixon:2015dcd,
     author    = "Paul R. Dixon",
     title     = "{OpenDcd: An Open Source WFST Decoding Toolkit}",
     year      = "2014"
}
