VocBench: A Neural Vocoder Benchmark for Speech Synthesis

A PyTorch implementation of the VocBench framework.

[arXiv]

Installation

  1. Python >= 3.6
  2. Get VocBench code
$ git clone https://github.com/facebookresearch/vocoder-benchmark.git
$ cd vocoder-benchmark
  3. Install dependencies
$ python3 -m venv vocbench
# activate the virtualenv
$ source vocbench/bin/activate
# Upgrade pip
$ python -m pip install --upgrade pip
# Install dependencies
$ pip install -e .
  4. To use the VocBench CLI, set the following paths in your .bashrc or .bash_profile.
VOCODER_BENCHMARK=/path/to/vocoder-benchmark
export PATH=$VOCODER_BENCHMARK/bin:$PATH
  5. Make the CLI script executable and test your installation
$ chmod +x $VOCODER_BENCHMARK/bin/vocoder
$ vocoder --help
Usage: cli.py [OPTIONS] COMMAND [ARGS]...

  Vocoder benchmarking CLI.

Options:
  --help  Show this message and exit.

Commands:
  dataset           Dataset processing.
  diffwave          Create, train, or use diffwave models.
  parallel_wavegan  Create, train, or use parallel_wavegan models.
  wavegrad          Create, train, or use wavegrad models.
  wavenet           Create, train, or use wavenet models.
  wavernn           Create, train, or use wavernn models.
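If `vocoder --help` does not work, a quick check of the path and permission steps above can narrow down the cause. The following is a minimal sketch, not part of VocBench: the function name `check_install` is hypothetical, and it assumes only the `bin/vocoder` layout and the `PATH` entry described in the installation steps. The checkout root is passed in explicitly because a variable set without `export` in .bashrc is not visible to child processes such as Python.

```python
import os
from pathlib import Path


def check_install(root, path_var):
    """Return a list of problems found with a VocBench checkout at `root`.

    `path_var` is the PATH string to inspect, e.g. os.environ["PATH"].
    (Hypothetical helper for illustration; not shipped with VocBench.)
    """
    problems = []
    bin_dir = Path(root).expanduser() / "bin"
    cli = bin_dir / "vocoder"

    if not cli.is_file():
        problems.append(f"missing CLI script: {cli}")
    elif not os.access(cli, os.X_OK):
        problems.append(f"not executable (run chmod +x): {cli}")

    # PATH entries are separated by os.pathsep (":" on Linux and macOS).
    entries = [Path(p).expanduser() for p in path_var.split(os.pathsep) if p]
    if bin_dir not in entries:
        problems.append(f"{bin_dir} is not on PATH")

    return problems


if __name__ == "__main__":
    # Placeholder path from the instructions above; replace with your checkout.
    root = "/path/to/vocoder-benchmark"
    for problem in check_install(root, os.environ.get("PATH", "")):
        print(problem)
```

An empty list means the script exists, is executable, and its directory is on `PATH`.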

Usage

Download dataset

$ vocoder dataset --help # For more information on how to download/split dataset

# e.g. download and split LJ Speech
$ vocoder dataset download --dataset ljspeech --path ~/local/datasets/lj # Download and unzip dataset files
$ vocoder dataset split --dataset ljspeech --path ~/local/datasets/lj  # Create train / validation / test splits

Training

$ vocoder [model-cmd] train --help

# e.g. train wavenet on LJ Speech dataset
$ vocoder wavenet train --path ~/local/models/wavenet --dataset ~/local/datasets/lj --config $VOCODER_BENCHMARK/config/wavenet_mulaw_normal.yaml

*MelGAN and Parallel WaveGAN share the same model command (parallel_wavegan). Choose the matching configuration file for each:

# MelGAN
$ vocoder parallel_wavegan train --path ~/local/models/melgan --dataset ~/local/datasets/lj --config $VOCODER_BENCHMARK/config/melgan.v1.yaml

# Parallel WaveGAN
$ vocoder parallel_wavegan train --path ~/local/models/parallel_wavegan --dataset ~/local/datasets/lj --config $VOCODER_BENCHMARK/config/parallel_wavegan.yaml

Example configuration files for each model are provided in the config directory.
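When training several models in sequence, it can help to keep the model-to-command mapping in one place, since MelGAN and Parallel WaveGAN share a command. The sketch below only builds and prints the command lines shown above; the `MODELS` table and the `train_command` helper are hypothetical names, and the config file names are the three that appear in this README.

```python
import os
import shlex

# Model name -> (CLI sub-command, example config file), copied from the
# commands above. MelGAN and Parallel WaveGAN share `parallel_wavegan`.
MODELS = {
    "wavenet": ("wavenet", "wavenet_mulaw_normal.yaml"),
    "melgan": ("parallel_wavegan", "melgan.v1.yaml"),
    "parallel_wavegan": ("parallel_wavegan", "parallel_wavegan.yaml"),
}


def train_command(model, benchmark_root, models_dir, dataset_dir):
    """Build the argv list for `vocoder <cmd> train` for one model."""
    cmd, config = MODELS[model]
    return [
        "vocoder", cmd, "train",
        "--path", os.path.join(os.path.expanduser(models_dir), model),
        "--dataset", os.path.expanduser(dataset_dir),
        "--config", os.path.join(benchmark_root, "config", config),
    ]


if __name__ == "__main__":
    # Placeholder locations; adjust to your own checkout and data paths.
    for name in MODELS:
        argv = train_command(name, "/path/to/vocoder-benchmark",
                             "~/local/models", "~/local/datasets/lj")
        # Quote each argument so the line can be pasted into a shell.
        print(" ".join(shlex.quote(a) for a in argv))
```

Each printed line can be run as-is, or the argv list can be passed to `subprocess.run(argv, check=True)` to stop at the first failing run.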

Synthesize

$ vocoder [model-cmd] synthesize --help
Usage: cli.py [model-cmd] synthesize [OPTIONS] INPUT_FILE OUTPUT_FILE

  Synthesize with the model.

Options:
  --path TEXT     Directory for the model  [required]
  --length TEXT   The length of the output sample in seconds
  --offset FLOAT  Offset in seconds of the sample
  --help          Show this message and exit.
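To illustrate how the options and positional arguments above fit together, here is a small sketch that assembles a `synthesize` call. It uses only the flags listed in the help text; the helper name `synthesize_command` and the file names are placeholders, and the input file format the CLI expects is not stated here, so check `--help` for your model before relying on it.

```python
import shlex


def synthesize_command(model_cmd, model_dir, input_file, output_file,
                       length=None, offset=None):
    """Build the argv list for `vocoder <model-cmd> synthesize`.

    Options come first, followed by the two positional arguments,
    matching the usage line shown above.
    """
    argv = ["vocoder", model_cmd, "synthesize", "--path", model_dir]
    if length is not None:
        argv += ["--length", str(length)]   # output length in seconds
    if offset is not None:
        argv += ["--offset", str(offset)]   # start offset in seconds
    return argv + [input_file, output_file]


if __name__ == "__main__":
    # Hypothetical paths and file names, for illustration only.
    argv = synthesize_command("wavenet", "/path/to/models/wavenet",
                              "input.wav", "output.wav",
                              length=5, offset=1.5)
    print(" ".join(shlex.quote(a) for a in argv))
```

Leaving `length` and `offset` as `None` omits those flags, so the CLI falls back to its own defaults.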

Evaluate

$ vocoder [model-cmd] evaluate --help
Usage: cli.py [model-cmd] evaluate [OPTIONS]

  Evaluate a given vocoder.

Options:
  --path TEXT        Directory for the model  [required]
  --dataset TEXT     Name of the dataset to use  [required]
  --checkpoint TEXT  Checkpoint path (default: load latest checkpoint)
  --help             Show this message and exit.

*Fréchet Audio Distance (FAD) is not currently implemented in VocBench. We use the Google Research open-source repository to obtain FAD results.
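For context, FAD is a Fréchet distance between two multivariate Gaussians fitted to audio embeddings of a reference set and a generated set. The closed form below is the standard formula, given as background rather than as a description of any code in this repository; the symbols $\mu_r, \Sigma_r$ (mean and covariance of the reference embeddings) and $\mu_g, \Sigma_g$ (mean and covariance of the generated embeddings) are notation introduced for this note.

```latex
\mathrm{FAD} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated audio's embedding statistics are closer to those of the reference audio.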

Reference Repositories

License

The majority of VocBench is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: WaveNet, ParallelWaveGAN, and the flops counter are licensed under the MIT license; DiffWave is licensed under the Apache 2.0 license; and WaveGrad is licensed under the BSD-3 license.

Used by

Papers that use our work (feel free to add your own by opening a pull request):
