WaveCRN

WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement

This repo provides an example usage of the proposed model. example/main.ipynb shows a minimal speech enhancement (SE) pipeline and a visualization of the enhanced speech sample.

Model Architecture

Figure: The architecture of the proposed WaveCRN model. For local feature extraction, a 1D CNN maps the noisy audio x into a 2D feature map F. A Bi-SRU then encodes F into a restricted feature mask (RFM) M, which is multiplied element-wise by F to generate a masked feature map F'. Finally, a transposed 1D convolution layer recovers the enhanced waveform y from F'.
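The data flow in the caption above can be sketched in PyTorch as follows. This is a minimal illustration, not the released model: the class name `WaveCRNSketch`, the channel count, the kernel size, and the stride are assumptions chosen for the example. A bidirectional `nn.GRU` stands in for the Bi-SRU so that the sketch runs without the `sru` package. How the mask is restricted is not described in this README, so the sketch leaves M unconstrained.

```python
import torch
import torch.nn as nn


class WaveCRNSketch(nn.Module):
    """Illustrative sketch of the x -> F -> M -> F' -> y flow (sizes are assumptions)."""

    def __init__(self, channels=64, kernel_size=96, stride=48):
        super().__init__()
        # 1D CNN: noisy waveform x -> feature map F
        self.encoder = nn.Conv1d(1, channels, kernel_size, stride=stride)
        # Stand-in for the Bi-SRU: a bidirectional GRU (assumption, not the paper's layer)
        self.rnn = nn.GRU(channels, channels, batch_first=True, bidirectional=True)
        # Project the concatenated forward/backward states back to `channels`
        self.proj = nn.Linear(2 * channels, channels)
        # Transposed 1D convolution: masked feature map F' -> waveform y
        self.decoder = nn.ConvTranspose1d(channels, 1, kernel_size, stride=stride)

    def forward(self, x):
        # x: (batch, 1, samples)
        feat = self.encoder(x)                 # F: (batch, channels, frames)
        h, _ = self.rnn(feat.transpose(1, 2))  # (batch, frames, 2 * channels)
        mask = self.proj(h).transpose(1, 2)    # M: (batch, channels, frames)
        masked = mask * feat                   # F' = M * F, element-wise
        return self.decoder(masked)            # y: (batch, 1, samples)


model = WaveCRNSketch()
noisy = torch.randn(2, 1, 4800)  # two random "waveforms" of 4800 samples each
enhanced = model(noisy)
print(enhanced.shape)
```

With these assumed settings, the output length matches the input length whenever `samples - kernel_size` is a multiple of `stride`; other lengths would need padding or trimming.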

Experimental Results

Results of Voice Bank + Demand Dataset

Results of TIMIT for Compressed Speech Restoration

Requirements

torch==1.4.0
sru==2.3.5

Q&A

Q: Table 1 shows that the proposed method, WaveCRN, performs slightly better than Wave-U-Net, but without confidence intervals it is not possible to ascertain whether these differences are statistically significant or just happened by chance.

A: We would like to emphasize that the major advantage of WaveCRN is that it achieves performance comparable to state-of-the-art SE methods (such as Wave-U-Net [36]) at much lower model complexity and computational cost. In the revised paper, we have clearly highlighted the main advantages of WaveCRN: (a) In Section III-B, we reported "It can be clearly seen from Table I that WaveCRN outperforms other models in terms of all perceptual and signal-level evaluation metrics." (b) From Tables I and II, it is clear that WaveCRN has a smaller model size and lower computational cost while achieving performance comparable to WaveCBLSTM. Table II shows that WaveCRN is 11.48/5.56 times faster than Wave-U-Net in the forward/back-propagation pass during training, while having only 27% as many parameters as Wave-U-Net. Due to the space limitation, we did not report the model size and the computational cost in Table II. Instead, we report Table II.R1 on our GitHub page (https://github.com/aleXiehta/WaveCRN). If you advise us to include the result, we will use Table II.R1 to replace Table II in the current manuscript.
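For readers who want to reproduce this kind of comparison, the sketch below shows one way parameter counts and forward/backward times might be measured in PyTorch. It is not the authors' benchmarking code: the helper names `count_parameters` and `time_forward_backward` and the toy model are hypothetical, and the timing is CPU-only (on a GPU, `torch.cuda.synchronize()` would be needed before reading each timestamp).

```python
import time

import torch
import torch.nn as nn


def count_parameters(model):
    """Return the number of trainable parameters in `model`."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def time_forward_backward(model, x, repeats=5):
    """Return average wall-clock seconds for the forward and backward passes."""
    fwd = bwd = 0.0
    for _ in range(repeats):
        model.zero_grad()
        t0 = time.perf_counter()
        loss = model(x).abs().mean()  # dummy scalar loss, only to drive backprop
        t1 = time.perf_counter()
        loss.backward()
        t2 = time.perf_counter()
        fwd += t1 - t0
        bwd += t2 - t1
    return fwd / repeats, bwd / repeats


# Hypothetical toy model; substitute the two models being compared.
toy = nn.Sequential(
    nn.Conv1d(1, 8, 9, padding=4),
    nn.ReLU(),
    nn.Conv1d(8, 1, 9, padding=4),
)
n_params = count_parameters(toy)
fwd_s, bwd_s = time_forward_backward(toy, torch.randn(2, 1, 1600))
print(f"parameters: {n_params}, forward: {fwd_s:.6f}s, backward: {bwd_s:.6f}s")
```

Running the same two helpers on both models and dividing the results gives speed-up and parameter ratios of the kind quoted above.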
