Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

This repository is the official PyTorch implementation of Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement, accepted by InterSpeech 2021.

Introduction

Our work addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single network that can perform well on unseen microphone arrays. The goal is for the network to learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics of one particular array. To this end, a single network is trained on data recorded by various virtual microphone arrays of different geometries, simulated with the RIR Generator [1] and simulated diffuse noise [2]. We design three variants of our recently proposed Narrow-Band Deep Filtering (NBDF) network [3] to cope with an unknown number of microphones.
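To make the idea of training on many virtual arrays concrete, here is a minimal sketch of how array geometries could be randomized before simulating room impulse responses. It is not the repository's data pipeline: the function names, the circular layout, and all numeric ranges (microphone count, radius, wall margin) are hypothetical choices for illustration only.

```python
import math
import random


def sample_circular_array(num_mics, radius, center, rng):
    """Place num_mics microphones on a horizontal circle with a random rotation.

    Hypothetical helper: a circular layout is only one of many possible geometries.
    """
    offset = rng.uniform(0.0, 2.0 * math.pi)
    cx, cy, cz = center
    return [
        (
            cx + radius * math.cos(offset + 2.0 * math.pi * m / num_mics),
            cy + radius * math.sin(offset + 2.0 * math.pi * m / num_mics),
            cz,
        )
        for m in range(num_mics)
    ]


def sample_virtual_array(room_dims, rng, min_mics=2, max_mics=8,
                         max_radius=0.1, margin=0.5):
    """Draw a random microphone count, array radius, and array center.

    All ranges are assumed values, not taken from the paper or the repository.
    """
    num_mics = rng.randint(min_mics, max_mics)
    radius = rng.uniform(0.02, max_radius)
    # Keep the array center at least `margin` metres away from every wall.
    center = tuple(rng.uniform(margin, d - margin) for d in room_dims)
    return sample_circular_array(num_mics, radius, center, rng)


rng = random.Random(0)
arrays = [sample_virtual_array((6.0, 5.0, 3.0), rng) for _ in range(3)]
for positions in arrays:
    print(len(positions), "microphones")
```

Each sampled position list could then be passed to a room impulse response simulator as the receiver positions, so that every training utterance sees a different array.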

figure 1

Key Features

  • Simulated_RIR_Generator
  • Network
    • original NBDF (CP-NBDF)
    • CC-NBDF
    • PW-NBDF
  • Train
  • Inference
  • Evaluation
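The network variants listed above all follow the narrowband principle: each frequency bin is treated as its own sequence and processed by one shared network. The sketch below shows one plausible way to arrange a multichannel STFT into such per-frequency sequences. The function name and the nested-list layout `stft[channel][frequency][frame]` are assumptions for illustration and do not reflect the repository's actual tensors.

```python
def to_narrowband_sequences(stft):
    """Rearrange stft[c][f][t] (complex) into one sequence per frequency bin.

    Returns sequences[f][t] = [re_0, im_0, re_1, im_1, ...], i.e. the real and
    imaginary parts of all channels concatenated for each time frame.
    """
    num_ch = len(stft)
    num_freq = len(stft[0])
    num_frames = len(stft[0][0])
    sequences = []
    for f in range(num_freq):
        seq = []
        for t in range(num_frames):
            feat = []
            for c in range(num_ch):
                x = stft[c][f][t]
                feat.extend((x.real, x.imag))
            seq.append(feat)
        sequences.append(seq)
    return sequences


# Toy input: 2 channels, 3 frequency bins, 4 frames.
toy = [[[complex(c + 1, f + t) for t in range(4)] for f in range(3)]
       for c in range(2)]
seqs = to_narrowband_sequences(toy)
print(len(seqs), len(seqs[0]), len(seqs[0][0]))  # 3 4 4
```

With this layout the feature dimension is twice the number of channels, so it changes whenever the microphone count changes. This is why a network meant to run on arbitrary arrays needs some explicit mechanism for handling a variable number of microphones.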

Get started

(1) Clone:

$ git clone https://github.com/atomicoo/Tacotron2-PyTorch.git

(2) Requirements:

$ pip install -r requirements.txt

The RIR Generator [1], the coherent multichannel noise generator [2], and the wind noise simulator [4] are also required.

References

[1] E. A. Habets, “Room impulse response generator,” Technische Universiteit Eindhoven, Tech. Rep., vol. 2, no. 2.4, p. 1, 2006.

[2] E. A. Habets, I. Cohen, and S. Gannot, “Generating nonstationary multisensor signals under a spatial coherence constraint,” The Journal of the Acoustical Society of America, vol. 124, no. 5, pp. 2911–2917, 2008.

[3] X. Li and R. Horaud, “Narrow-band deep filtering for multichannel speech enhancement,” arXiv preprint arXiv:1911.10791, 2019.

[4] D. Mirabilii and E. A. Habets, “Simulating multi-channel wind noise based on the Corcos model,” in 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC). IEEE, 2018, pp. 560–564.
