Giter Club home page Giter Club logo

neuralsvb's Introduction

Learning the Beauty in Songs: Neural Singing Voice Beautifier

Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao

Zhejiang University

ACL 2022 Main conference


arXiv GitHub Stars visitors

🚧 ⛏️ 🛠️ 👷

This repository is the official PyTorch implementation of our ACL-2022 paper. Now, we release the codes for SADTW algorithm and dataset (PopBuTFy) in our paper. Please wait for other codes and pre-trained models.

|--modules
    |--voice_conversion
        |--dtw
            |--enhance_sadtw.py  (Our algorithm)
|--tasks
    |--singing
        |--pitch_alignment_task.py  (Usage example)

🚀 News:

  • Feb.24, 2022: Our new work, NeuralSVB was accepted by ACL-2022. Demo Page.
  • Dec.01, 2021: Our recent work DiffSinger was accepted by AAAI-2022. downloads | .
  • Sep.29, 2021: Our recent work PortaSpeech was accepted by NeurIPS-2021. .
  • May.06, 2021: We submitted DiffSinger to Arxiv arXiv.

Dataset (PopBuTFy) Acquirement

Audio samples

Text labels

NeuralSVB does not need text as input, but the ASR model to extract PPG needs text. Thus we also provide the text labels of PopBuTFy.

Citation

If this repository helps your research, please cite:

@inproceedings{liu-etal-2022-learning-beauty,
title = "Learning the Beauty in Songs: Neural Singing Voice Beautifier",
author = "Liu, Jinglin  and
  Li, Chengxi  and
  Ren, Yi  and
  Zhu, Zhiying  and
  Zhao, Zhou",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.acl-long.549",
pages = "7970--7983",}

Abstract

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them are restricted to intonation but ignore the overall aesthetic quality. Hence, we introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as the backbone and learns the latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction: Shape-Aware Dynamic Time Warping (SADTW), which ameliorates the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one. Extensive experiments on both Chinese and English songs demonstrate the effectiveness of our methods in terms of both objective and subjective metrics.

Issues

  • Before raising a issue, please check our Readme and other issues for possible solutions.
  • We will try to handle your problem in time but we could not guarantee a satisfying solution.
  • Please be friendly.

neuralsvb's People

Contributors

moonintheriver avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.