The abc_asr from aaaceo890

Multi-modal Speech Recognition for ABCS Corpus

This repository is the official implementation of "End-to-End Multi-Modal Speech Recognition on an Air and Bone Conducted Speech Corpus" (TASLP 2023).

Installation

If you only need the modules, run
```
pip install espnet
```
first, after which you can use the modules in abc_asr/model.
To run the full experiments, you need to install ESPnet and Kaldi correctly first. See Installation.

Next, run
```
pip install -r requirements.txt
```
to install the required packages.
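
Before going further, it can help to confirm that the Python side of the installation is visible to the interpreter you will actually use. The sketch below is not part of this repository; `missing_modules` is a hypothetical helper that only checks whether top-level packages such as `espnet` can be located. It does not check Kaldi, which is not a Python package.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import.

    Only top-level names (no dots) should be passed in; nothing is imported.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "espnet" is the package installed by `pip install espnet` above.
    missing = missing_modules(["espnet"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All checked modules can be located.")
```

An empty result only means the packages can be found, not that their versions match what the experiments expect.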

Data Preparation

Download dataset.

Download the ABCS Corpus here: Links.

Download the noisy air conducted data (ns_air_data.zip) here: [Onedrive] or [Baidu Cloud]

Unzip the noisy data into the Audio folder of the ABCS directory:
```
unzip -d <ABCS dir>/Audio/ ns_air_data.zip
```
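
If `unzip` is not available on your system, the same step can be approximated with Python's standard library. This is a minimal sketch, not a script shipped with the repository; `extract_noisy_data` is a hypothetical name, and the only layout it assumes is the `<ABCS dir>/Audio/` target used in the command above.

```python
import zipfile
from pathlib import Path


def extract_noisy_data(zip_path, abcs_dir):
    """Extract the archive into <abcs_dir>/Audio, mirroring `unzip -d`.

    Returns the sorted list of member names contained in the archive.
    """
    target = Path(abcs_dir) / "Audio"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


if __name__ == "__main__":
    import sys

    # Usage: python3 extract_noisy.py ns_air_data.zip <ABCS dir>
    members = extract_noisy_data(sys.argv[1], sys.argv[2])
    print(f"Extracted {len(members)} entries")
```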

Execute the data preparation script.

For inference only:

python3 data_prep.py --dataset_root <ABCS dir> --test

For full experiments:

python3 data_prep.py --dataset_root <ABCS dir>

Inference

Ensure that Kaldi and ESPnet are properly installed in your environment. Next, set the third line of test.sh to your ESPnet root:
```
export ESPNETROOT=<Your Espnet Root>
```
Download the model parameters file here: [Onedrive] or [Baidu Cloud], then move it into the results directory:
```
mv model.acc.best <Your Path>/abc_asr/results
```
Run
```
bash test.sh
```
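
If `test.sh` fails early, the two things set up above are the first to check. The sketch below is a hypothetical pre-flight helper, not part of the repository. It assumes only what this section states: that the ESPnet root you wrote into `test.sh` is a directory, and that `model.acc.best` was moved into `abc_asr/results`.

```python
from pathlib import Path


def preflight(abc_asr_dir, espnet_root):
    """Return a list of human-readable problems; an empty list means none found.

    abc_asr_dir: path to your checkout of this repository.
    espnet_root: the value you assigned to ESPNETROOT in test.sh.
    """
    problems = []
    if not Path(espnet_root).is_dir():
        problems.append(f"ESPnet root is not a directory: {espnet_root}")
    model = Path(abc_asr_dir) / "results" / "model.acc.best"
    if not model.is_file():
        problems.append(f"Model parameters file not found: {model}")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python3 preflight.py <Your Path>/abc_asr <Your Espnet Root>
    for problem in preflight(sys.argv[1], sys.argv[2]):
        print("WARNING:", problem)
```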

Results (CER %)

| Model | SNR=-5dB | SNR=0dB | SNR=5dB | SNR=10dB | SNR=15dB | SNR=20dB | Clean |
| --- | --- | --- | --- | --- | --- | --- | --- |
| The proposed MMT | 17.5 | 14.9 | 11.8 | 9.4 | 7.9 | 7.1 | 6.7 |
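
For readers unfamiliar with the metric, character error rate (CER) is conventionally the character-level edit distance between the reference and the hypothesis, divided by the reference length. The sketch below illustrates that definition only; it is not the scoring code used to produce the numbers above, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent for one reference/hypothesis pair."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(cer("abcd", "abxd"))
```

Corpus-level figures are usually obtained by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.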

TODO

The training pipeline.

Citing

If you find this code helpful, please consider citing it as follows:

@ARTICLE{9961873,
  author={Wang, Mou and Chen, Junqi and Zhang, Xiao-Lei and Rahardja, Susanto},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing}, 
  title={End-to-End Multi-Modal Speech Recognition on an Air and Bone Conducted Speech Corpus}, 
  year={2023},
  volume={31},
  number={},
  pages={513-524},
  keywords={Speech recognition;Speech processing;Signal to noise ratio;Spectrogram;Headphones;Microphones;Synchronization;Speech recognition;multi-modal speech processing;bone conduction;air- and bone-conducted speech corpus},
  doi={10.1109/TASLP.2022.3224305}}

abc_asr's Issues

some questions

When I ran the step "python3 data_prep --dataset_root", I encountered the problem below. How can I solve it? I couldn't find CDPR in the directory. There is also another issue: the requirements list thread>=2.0.0. What should I install? Thank you for sharing the code and dataset. I am really interested in this project and hope to replicate it myself. I am a beginner and hope to receive your answer!

Traceback (most recent call last):
  File "data_prep.py", line 12, in <module>
    from modules.DataGenerator import BoneConductDataGenerator, BoneConductNSDataGenerator
  File "/mnt/e/code/abc_asr-master/preprocessing/modules/DataGenerator/BoneConductDataGenerator.py", line 1, in <module>
    from CDPR.modules.DataGenerator.data_generator import DataGenerator as InterFace
ModuleNotFoundError: No module named 'CDPR'
