AISHELL-4

This project accompanies the recently released AISHELL-4 dataset for speech enhancement, separation, recognition and speaker diarization in conference scenarios. The project, which serves as a baseline, is divided into five parts: data preparation (data_preparation), front end (front_end), ASR (asr), speaker diarization (sd) and evaluation. The Speaker Independent (SI) task evaluates only the front end (FE) and ASR models, while the Speaker Dependent (SD) task evaluates the joint performance of the speaker diarization, front end and ASR models. The goal of this project is to simplify the training and evaluation procedure and to make it easy and flexible for researchers to carry out experiments and verify neural-network-based methods.

Setup

git clone https://github.com/felixfuyihui/AISHELL-4.git
cd AISHELL-4
pip install -r requirements.txt

Introduction

  • Data Preparation: Prepare the training and evaluation data.
  • Front End: Train and evaluate the front end model.
  • ASR: Train and evaluate the ASR model.
  • Speaker Diarization: Generate the speaker diarization results.
  • Evaluation: Evaluate the results of the models above and generate the character error rates (CERs) for the Speaker Independent and Speaker Dependent tasks.
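
The evaluation part reports CERs. As a rough illustration of what that metric measures, here is a minimal sketch that computes the character error rate as the Levenshtein distance between reference and hypothesis characters, divided by the number of reference characters. The function names `edit_distance` and `cer` are hypothetical and are not taken from this repository; the baseline's own scoring script may apply additional text normalization or handle the Speaker Dependent task differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion cost 1)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance over the number of reference characters.

    Assumption for this sketch: spaces are ignored, and no other normalization is applied.
    """
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One substituted character out of six reference characters.
    print(f"CER = {cer('今天天气很好', '今天天汽很好'):.4f}")
```

Because the rate is normalized by the reference length only, it can exceed 1.0 when the hypothesis contains many insertions.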

General steps

  1. Generate the training data for the FE and ASR models and the evaluation data for the Speaker Independent task.
  2. Run speaker diarization to generate the RTTM files, which contain the VAD and speaker diarization information.
  3. Generate the evaluation data for the Speaker Dependent task using the results from step 2.
  4. Train the FE and ASR models separately.
  5. Generate the FE results on the evaluation data for the Speaker Independent and Speaker Dependent tasks.
  6. Generate the ASR results without FE (No FE) on the evaluation data for the Speaker Independent and Speaker Dependent tasks, using the results from steps 2 and 3.
  7. Generate the ASR results with FE on the evaluation data for the Speaker Independent and Speaker Dependent tasks, using the results from step 5.
  8. Generate the CER results for the Speaker Independent and Speaker Dependent tasks, without and with FE, using the results from steps 6 and 7 respectively.
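
Step 2 produces RTTM output that step 3 consumes. As a sketch of what reading such a file might look like, the following parses lines in the standard ten-field RTTM layout (`SPEAKER <file> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>`) into time-ordered segments. Whether the baseline's RTTM files follow exactly this layout is an assumption, and the names `Segment` and `parse_rttm`, as well as the example session and speaker IDs, are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Segment(NamedTuple):
    file_id: str     # recording / session identifier
    onset: float     # segment start time in seconds
    duration: float  # segment length in seconds
    speaker: str     # speaker label assigned by diarization


def parse_rttm(lines: Iterable[str]) -> List[Segment]:
    """Parse SPEAKER records from RTTM lines, sorted by file and onset time."""
    segments = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and any record type other than SPEAKER.
        if not fields or fields[0] != "SPEAKER":
            continue
        if len(fields) < 8:
            raise ValueError(f"malformed RTTM line: {line!r}")
        segments.append(
            Segment(
                file_id=fields[1],
                onset=float(fields[3]),
                duration=float(fields[4]),
                speaker=fields[7],
            )
        )
    return sorted(segments, key=lambda s: (s.file_id, s.onset))


# Hypothetical example records, not taken from the dataset.
example = [
    "SPEAKER meeting_A 1 3.25 1.50 <NA> <NA> spk2 <NA> <NA>",
    "SPEAKER meeting_A 1 0.50 2.25 <NA> <NA> spk1 <NA> <NA>",
]
segments = parse_rttm(example)

if __name__ == "__main__":
    for seg in segments:
        print(seg.file_id, seg.speaker, seg.onset, seg.onset + seg.duration)
```

Each parsed segment gives the time span and speaker label that could be used to cut per-speaker evaluation audio for the Speaker Dependent task.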

Citation

If you use this challenge dataset and baseline system in a publication, please cite the following paper:

@article{fu2021aishell,
         title={AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario},
         author={Fu, Yihui and Cheng, Luyao and Lv, Shubo and Jv, Yukai and Kong, Yuxiang and Chen, Zhuo and Hu, Yanxin and Xie, Lei and Wu, Jian and Bu, Hui and Xu, Xin and Du, Jun and Chen, Jingdong},
         year={2021},
         conference={Interspeech2021, Brno, Czech Republic, Aug 30 - Sept 3, 2021}
         }

The paper is available at https://arxiv.org/abs/2104.03603

The dataset is available at http://www.openslr.org/111/ and http://www.aishelltech.com/aishell_4

Code license

Apache 2.0
