K-Wav2vec 2.0

This is the official implementation of "K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables".

Requirements and Installation

  • PyTorch version >= 1.7.1
  • Python version >= 3.6
  • To install K-wav2vec and develop locally:
git clone https://github.com/Yoon-SeokJin/K-wav2vec
cd K-wav2vec

## install the required libraries
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html

## install locally
python setup.py develop
  • This implementation has only been tested on Ubuntu 18.04.
  • A Dockerfile is also provided in this repo.
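
Before installing, it can help to confirm that the environment meets the minimum versions listed above. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the check assumes that local build suffixes such as `+cu110` can be ignored when comparing PyTorch versions.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, ignoring suffixes such as '+cu110'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that "2.0" compares like "2.0.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Compare the running interpreter and PyTorch against the README minimums."""
    python_ok = sys.version_info >= (3, 6)
    try:
        import torch  # imported lazily so the check also runs before installation
        torch_ok = meets_minimum(torch.__version__, "1.7.1")
    except ImportError:
        torch_ok = False
    return {"python": python_ok, "torch": torch_ok}


if __name__ == "__main__":
    print(check_environment())
```

If `torch` is reported as `False`, either PyTorch is missing or it is older than 1.7.1, and the `pip install -r requirements.txt` step above should be run first.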

Instructions

  • We provide example scripts to make the code easy to run (see the script folder).
  • Following these instructions should reproduce the reported results.
# Guide to building the multi-model with Ksponspeech (orthographic transcription)

# [1] preprocess dataset & make manifest
bash script/preprocess/make_ksponspeech_script_for_mulitmodel.sh

# [2] further pre-train the model
bash script/pretrain/run_further_pretrain.sh
 
# [3] fine-tune the model
bash script/finetune/run_ksponspeech_multimodel.sh

# [4] run inference with the model
bash script/inference/evaluate_multimodel.sh
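
The four steps above run in a fixed order, and each depends on the output of the previous one. The sketch below shows one way to chain them from Python so that a failing stage stops the run. The script paths are copied from the steps above; the runner itself (`PIPELINE`, `build_commands`, `run_pipeline`) is hypothetical and is not shipped with the repository. It assumes it is launched from the repository root.

```python
import subprocess

# Stage names are labels for logging only; the paths come from the steps above.
PIPELINE = [
    ("preprocess", "script/preprocess/make_ksponspeech_script_for_mulitmodel.sh"),
    ("further pre-train", "script/pretrain/run_further_pretrain.sh"),
    ("fine-tune", "script/finetune/run_ksponspeech_multimodel.sh"),
    ("inference", "script/inference/evaluate_multimodel.sh"),
]


def build_commands(start=1, stop=len(PIPELINE)):
    """Return the bash commands for stages start..stop (1-indexed, inclusive)."""
    if not 1 <= start <= stop <= len(PIPELINE):
        raise ValueError("stage range must satisfy 1 <= start <= stop <= 4")
    return [["bash", path] for _, path in PIPELINE[start - 1:stop]]


def run_pipeline(start=1, stop=len(PIPELINE), dry_run=False):
    """Run the selected stages in order; check=True aborts on the first failure."""
    stages = PIPELINE[start - 1:stop]
    for (name, _), command in zip(stages, build_commands(start, stop)):
        print(f"[{name}] {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands without executing them.
    run_pipeline(dry_run=True)
```

Passing `start=3`, for example, would resume from fine-tuning once preprocessing and further pre-training have already completed.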

Pretrained model

  • E-Wav2vec 2.0 : Wav2vec 2.0 pretrained on an English dataset, released by Fairseq(-py)
  • K-Wav2vec 2.0 : The model further pretrained on Ksponspeech, starting from the English model
    • Fairseq Version : If you want to fine-tune the model with the fairseq framework, you can download it from this LINK
    • Huggingface Version : If you want to fine-tune the model with the huggingface framework, you can download it from this LINK

Dataset

Acknowledgments

  • Our code is adapted from the fairseq codebase. We use the same license as fairseq.
  • The preprocessing code was developed with reference to Kospeech.

License

Our implementation code (-py) is MIT-licensed. The license applies to the pre-trained models as well.
