
WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN

Pritish Chandna, Merlijn Blaauw, Jordi Bonada, Emilia Gómez

Music Technology Group, Universitat Pompeu Fabra, Barcelona

This repository contains the source code for multi-voice singing voice synthesis.

Installation

To install, clone the repository and use
pip install -r requirements.txt 
to install the packages required.

The main code is in the main.py file.

Training and inference

To use WGANSing, download the model weights and place them in the log_dir directory, defined in config.py.

The NUS-48E dataset can be downloaded from here. Once downloaded, set wav_dir_nus in config.py to the directory that contains the dataset.
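As a minimal sketch of the two settings mentioned above, the snippet below shows how log_dir and wav_dir_nus might look and how to check that both directories exist before running anything. The path values and the missing_dirs helper are assumptions for illustration only; the actual config.py in the repository may define these settings differently.

```python
import os

# Hypothetical values: point these at your own locations.
log_dir = "./log/"                 # directory holding the downloaded model weights
wav_dir_nus = "./nus-48e/"         # directory holding the NUS-48E dataset


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


if __name__ == "__main__":
    # Report any configured directory that has not been created yet.
    for path in missing_dirs([log_dir, wav_dir_nus]):
        print("Directory not found:", path)
```

Running this check before training or synthesis makes a wrong path show up immediately rather than partway through data preparation.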

To prepare the data for use, run prep_data_nus.py.

Once set up, you can run the following commands. To train the model:

python main.py -t

To synthesize a .lab file, use:

python main.py -e filename alternate_singer_name 

If no alternate singer is given then the original singer will be used for synthesis. A list of valid singer names will be displayed if an invalid singer is entered.
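The fallback behaviour described above can be sketched roughly as follows. This is not the repository's code: the choose_singer function is hypothetical, and the singer names in the default list are illustrative placeholders rather than the list that main.py prints.

```python
def choose_singer(original, alternate=None, valid_singers=("ADIZ", "JLEE", "KENN")):
    """Pick the singer identity to synthesize with.

    `valid_singers` is an illustrative placeholder list, not the
    repository's actual list of singer names.
    """
    # No alternate singer given: keep the original singer.
    if alternate is None:
        return original
    # Invalid alternate: report the valid names instead of guessing.
    if alternate not in valid_singers:
        raise ValueError(
            "Invalid singer %r; valid singers: %s"
            % (alternate, ", ".join(valid_singers))
        )
    return alternate
```

For example, choose_singer("ADIZ") keeps the original singer, while choose_singer("ADIZ", "KENN") switches to the alternate one.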

You will also be prompted on whether plots should be displayed; press y or Y to view plots.
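A small sketch of how such a prompt could be interpreted, assuming that only y or Y counts as a yes and any other answer skips the plots. The wants_plots helper is hypothetical and is not taken from main.py.

```python
def wants_plots(answer):
    """Return True only when the answer is 'y' or 'Y' (surrounding spaces ignored)."""
    return answer.strip() in ("y", "Y")


if __name__ == "__main__":
    # Ask once on the command line; anything other than y/Y means no plots.
    reply = input("Display plots? (y/n): ")
    print("Plots will be shown." if wants_plots(reply) else "Plots will be skipped.")
```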

Acknowledgments

The TITANX used for this research was donated by the NVIDIA Corporation. This work is partially supported by the Towards Richer Online Music Public-domain Archives (TROMPA) (H2020 770376) European project.

[1] Duan, Zhiyan, et al. "The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech." 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. IEEE, 2013.

[2] Blaauw, Merlijn, and Jordi Bonada. "A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs." Applied Sciences 7.12 (2017): 1313.

[3] Blaauw, Merlijn, et al. "Data efficient voice cloning for neural singing synthesis." 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
