

ASRdys

Description

ASRdys is a Kaldi recipe to build an ASR system for speakers with dysarthria. The recipe works on the Torgo database [1], so you need to obtain this data first and set its location in the path.sh file. Several models are used in the pipeline implemented in run.sh/runMultipleSubtests.sh.

Usage

To train several models on all the speakers except one, which is held out for testing, run:

  bash ./run.sh <test_speaker>

where <test_speaker> is one of the 15 speakers in the database.
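For example, to hold out the first female speaker (assuming the speaker IDs follow the Torgo naming scheme F01, M01, and so on):

  bash ./run.sh F01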

To train several models on 15 different configurations, each holding out a different test speaker, run:

  bash ./runAllTests.sh

To train several models and see the results on different partitions of the test set, define those partitions with torgo_data_prep_multiple_tests.sh (if different from the current ones), initialise the tests variable in runMultipleSubtests.sh (a sketch follows the command below), and run as before:

  bash ./runMultipleSubtests.sh <test_speaker>
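A hypothetical initialisation of the tests variable, assuming it simply lists the names of the test-set partitions produced by torgo_data_prep_multiple_tests.sh (the partition names below are illustrative only):

  # in runMultipleSubtests.sh -- partition names are illustrative, adapt them to your setup
  tests="test_arrayMic test_headMic"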

Remember to adapt path.sh to your needs.
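A minimal sketch of a path.sh following the usual Kaldi recipe layout; everything except KALDI_ROOT and the standard Kaldi directories is illustrative and should be adapted to your installation:

  # path.sh -- sketch only
  export KALDI_ROOT=/path/to/kaldi                       # your Kaldi installation
  export PATH=$PWD/utils:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/src/bin:$PWD:$PATH
  export LC_ALL=C                                        # Kaldi expects C locale sorting
  export TORGO=/path/to/TORGO                            # location of the Torgo database (variable name is illustrative)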

TODO

Update the Deep Learning scripts
Add a RESULTS file

Authors

Cristina España-Bonet
(scripts specific to the Torgo database: ./local)

Citation

Cristina España-Bonet and José A. R. Fonollosa. Automatic Speech Recognition with Deep Neural Networks for Impaired Speech. Chapter in Advances in Speech and Language Technologies for Iberian Languages, part of the series Lecture Notes in Artificial Intelligence. In A. Abad et al. (Eds.), IberSPEECH 2016, LNAI 10077, Chapter 10, pages 97-107, October 2016.

References

[1] Frank Rudzicz, Aravind Kumar Namasivayam and Talya Wolff. The TORGO database of acoustic and articulatory speech from speakers with dysarthria. Language Resources and Evaluation, Volume 46, Issue 4, pp. 523-541, December 2012.


asrdys's Issues

Lattice Error while calling make_denlats.sh in run_dnn.sh script

Hi Cristinae,
(For speaker M01 from the Torgo database.)
I am running the DNN scripts, so I compute MFCCs -> LDA -> MLLT -> SAT, which gives me the tri3b model in the exp folder. The final three scripts called before run_dnn.sh are:

  steps/train_sat.sh
  steps/decode_fmllr.sh
  steps/align_fmllr.sh

Then I run run_dnn.sh, which calls scripts for:
- storing the fMLLR features
- pre-training the DBN
- training the DNN, optimizing per-frame cross-entropy

Up to this point everything works fine. Now, the DNN is retrained with 6 iterations of sMBR, which is done using the following code:

  if [ $stage -le 3 ]; then
    # First we generate lattices and alignments:
    echo "Generate lattices and alignments"
    #-- steps/nnet/align.sh --nj $nj --cmd "$train_cmd"
    #--   $data_fmllr/train data/lang $srcdir ${srcdir}_ali
    steps/nnet/make_denlats.sh --nj $nj --cmd "$decode_cmd" --config conf/decode_dnn.config --acwt $acwt \
      $data_fmllr/train data/lang $srcdir ${srcdir}_denlats || exit 1;
  fi

The alignment script works fine, but when I run make_denlats.sh, I get the following error:

  run.pl: job failed, log is in exp/dnn4b_pretrain-dbn_dnn_denlats/log/decode_den.1.log

Following is the output of the log file; check the last line.

  latgen-faster-mapped --beam=30.0 --lattice-beam=18.0 --acoustic-scale=0.1 --max-mem=20000000 --max-active=5000 --word-symbol-table=data/lang/words.txt exp/dnn4b_pretrain-dbn_dnn/final.mdl exp/dnn4b_pretrain-dbn_dnn_denlats/dengraph/HCLG.fst 'ark,s,cs:copy-feats scp:data-fmllr-tri3b/train/split1/1/feats.scp ark:- | nnet-forward --no-softmax=true --prior-scale=1.0 --feature-transform=exp/dnn4b_pretrain-dbn_dnn/final.feature_transform --class-frame-counts=exp/dnn4b_pretrain-dbn_dnn/ali_train_pdf.counts --use-gpu=yes exp/dnn4b_pretrain-dbn_dnn/final.nnet ark:- ark:- |' scp:exp/dnn4b_pretrain-dbn_dnn_denlats/lat.store_separately_as_gz.scp
  copy-feats scp:data-fmllr-tri3b/train/split1/1/feats.scp ark:-
  nnet-forward --no-softmax=true --prior-scale=1.0 --feature-transform=exp/dnn4b_pretrain-dbn_dnn/final.feature_transform --class-frame-counts=exp/dnn4b_pretrain-dbn_dnn/ali_train_pdf.counts --use-gpu=yes exp/dnn4b_pretrain-dbn_dnn/final.nnet ark:- ark:-
  WARNING (nnet-forward:SelectGpuId():cu-device.cc:182) Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
  LOG (nnet-forward:SelectGpuIdAuto():cu-device.cc:300) Selecting from 1 GPUs
  LOG (nnet-forward:SelectGpuIdAuto():cu-device.cc:315) cudaSetDevice(0): GeForce GTX TITAN X free:12151M, used:136M, total:12287M, free/total:0.988892
  LOG (nnet-forward:SelectGpuIdAuto():cu-device.cc:364) Trying to select device: 0 (automatically), mem_ratio: 0.988892
  LOG (nnet-forward:SelectGpuIdAuto():cu-device.cc:383) Success selecting device 0 free mem ratio: 0.988892
  LOG (nnet-forward:FinalizeActiveGpu():cu-device.cc:225) The active GPU is [0]: GeForce GTX TITAN X free:12134M, used:153M, total:12287M, free/total:0.987508 version 5.2
  LOG (nnet-forward:main():nnet-forward.cc:93) Removing from the nnet exp/dnn4b_pretrain-dbn_dnn/final.nnet
  LOG (nnet-forward:PdfPrior():nnet-pdf-prior.cc:34) Computing pdf-priors from : exp/dnn4b_pretrain-dbn_dnn/ali_train_pdf.counts
  LOG (nnet-forward:PdfPrior():nnet-pdf-prior.cc:64) Floored 0 pdf-priors (hard-set to 1.84467e+19, which disables DNN output when decoding)
  /home/sbp3624/Torgo/data/F01-Session1-arrayMic-0006 STICK
  LOG (latgen-faster-mapped:RebuildRepository():determinize-lattice-pruned.cc:283) Rebuilding repository.
  sh: exp/dnn4b_pretrain-dbn_dnn_denlats/lat1//home/sbp3624/Torgo/data/F01-Session1-arrayMic-0006.gz: No such file or directory

I tried to compare the scripts with Kaldi's tedlium/s5, and they do the same thing. As far as I remember, I don't think I missed any of the steps. Obviously, I did not execute the MMI, MPE, etc. scripts because I do not need them. Can you guide me on what the possible issue could be?
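The last log line shows the lattice filename being built from an utterance ID that is itself an absolute path (/home/sbp3624/Torgo/data/F01-Session1-arrayMic-0006), so the slashes in the ID point at a directory that does not exist under lat1/. A quick way to check whether the utterance IDs in the fMLLR data directory contain slashes (a sketch, reusing the feats.scp path shown in the log):

  # list utterance IDs that contain a slash (illustrative check; adjust the path if needed)
  awk '{print $1}' data-fmllr-tri3b/train/feats.scp | grep '/' | head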

translation failed after preparing pronunciations for OOV words in torgo_prepare_dict

My system is Ubuntu 14.04 running Python 2.7.
In the torgo_prepare_dict shell script, when the script runs

  g2p.py --model=conf/g2p_model --apply $locdict/vocab-oov.txt > $locdict/lexicon-oov.txt

I get the following error:

  failed to convert "-pau-": translation failed
  failed to convert "Aluminum": translation failed

and similarly for all the words in the vocab-oov.txt file.
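One way to narrow this down is to run the G2P model on a single word (a sketch, assuming Sequitur G2P is installed and conf/g2p_model is the model shipped with the recipe):

  # test the G2P model on one word (illustrative)
  echo "Aluminum" > /tmp/one_word.txt
  g2p.py --model=conf/g2p_model --apply /tmp/one_word.txt

If even a single ordinary word fails, the problem more likely lies with the g2p_model file or the Sequitur installation than with the OOV list itself.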
