mcnet's Introduction

McNet

This is the official repo of "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", accepted by ICASSP 2023 (https://arxiv.org/pdf/2211.08872.pdf). Examples can be found at https://audio.westlake.edu.cn/Research/McNet.htm.

Table 1. Performance of offline speech enhancement. An asterisk (*) means the scores are quoted from the original papers.

Method                   NB-PESQ  WB-PESQ  STOI (%)  SDR (dB)
Noisy                    1.82     1.27     87.0      7.5
MNMF Beamforming* [20]   -        -        94.0      16.2
Oracle MVDR              2.49     1.94     97.0      17.3
CA Dense U-net* [12]     -        2.44     -         18.6
Narrow-band Net [11]     2.74     2.13     95.0      16.6
FT-JNF [14]              3.17     2.48     96.2      17.7
McNet (prop.)            3.38     2.73     97.6      19.6

Table 2. Performance of online speech enhancement.

Method                   NB-PESQ  WB-PESQ  STOI (%)  SDR (dB)
Noisy                    1.82     1.27     87.0      7.5
Narrow-band Net [11]     2.70     2.15     94.7      16.0
FT-JNF [14]              2.80     2.23     95.4      16.9
McNet (prop.)            3.29     2.67     97.2      19.0
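
The gains reported in Table 2 can be made explicit with a little arithmetic. The sketch below is not part of the repo: it copies the scores from the table and subtracts a baseline row from the McNet row. The names `TABLE_2` and `gains` are illustrative only.

```python
# Scores copied from Table 2 (online speech enhancement).
# Column order: NB-PESQ, WB-PESQ, STOI, SDR.
METRICS = ("NB-PESQ", "WB-PESQ", "STOI", "SDR")
TABLE_2 = {
    "Noisy": (1.82, 1.27, 87.0, 7.5),
    "Narrow-band Net": (2.70, 2.15, 94.7, 16.0),
    "FT-JNF": (2.80, 2.23, 95.4, 16.9),
    "McNet": (3.29, 2.67, 97.2, 19.0),
}


def gains(method, baseline="Noisy"):
    """Return the per-metric improvement of `method` over `baseline`."""
    return {
        name: round(m - b, 2)  # round away floating-point noise
        for name, m, b in zip(METRICS, TABLE_2[method], TABLE_2[baseline])
    }


if __name__ == "__main__":
    for baseline in ("Noisy", "FT-JNF"):
        print(f"McNet vs {baseline}: {gains('McNet', baseline)}")
```

Relative to the unprocessed input, the online McNet improves SDR by 11.5 and WB-PESQ by 1.40; relative to FT-JNF, the strongest online baseline in the table, the margins are 2.1 and 0.44.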

Train & Test

Reminder: This project is built on the pytorch-lightning package, in particular its command line interface (CLI). To understand the commands and config files below, you need some basic knowledge of the CLI in lightning.

Train:

python McNetCLI.py fit --config config/mc_net_online.yaml

Test:

python McNetCLI.py test --config config/mc_net_online.yaml

To test with our pretrained model:

python McNetCLI.py test --config config/mc_net_offline.yaml --trainer.gpus 0,1 --ckpt_path model_checkpoints/offline/epoch494_criteria18.78_sdr18.78.ckpt
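
The checkpoint file name appears to encode the epoch and the validation scores. Here is a minimal sketch for reading those values back, assuming the pattern `epoch<N>_criteria<X>_sdr<Y>.ckpt`. That pattern is inferred from the single file name above and is not confirmed by the repo; `parse_ckpt_name` is a hypothetical helper.

```python
import re
from pathlib import PurePosixPath

# ASSUMPTION: naming pattern inferred from one example file name only.
_NUMBER = r"-?\d+(?:\.\d+)?"
CKPT_PATTERN = re.compile(
    rf"epoch(?P<epoch>\d+)_criteria(?P<criteria>{_NUMBER})_sdr(?P<sdr>{_NUMBER})\.ckpt"
)


def parse_ckpt_name(path):
    """Extract epoch, criteria and SDR from a checkpoint file name."""
    name = PurePosixPath(path).name  # drop the directory part
    match = CKPT_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name!r}")
    return {
        "epoch": int(match["epoch"]),
        "criteria": float(match["criteria"]),
        "sdr": float(match["sdr"]),
    }


if __name__ == "__main__":
    ckpt = "model_checkpoints/offline/epoch494_criteria18.78_sdr18.78.ckpt"
    print(parse_ckpt_name(ckpt))
```

A helper like this can be handy for picking the best of several saved checkpoints by SDR, provided they all follow the same naming scheme.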

Update

3.24: Added a predict module.

python McNetCLI.py predict --config config/mc_net_offline.yaml --trainer.gpus 0,1 --ckpt_path model_checkpoints/offline/epoch494_criteria18.78_sdr18.78.ckpt
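
The fit, test and predict commands share one shape, so they can be assembled programmatically when scripting experiments. The sketch below is not part of the repo. It uses only the subcommands and flags shown in the commands above (`--config`, `--trainer.gpus`, `--ckpt_path`); the helper name `build_command` is hypothetical.

```python
import shlex

SUBCOMMANDS = ("fit", "test", "predict")  # the three subcommands used above


def build_command(subcommand, config, gpus=None, ckpt_path=None):
    """Build the argument list for one McNetCLI.py invocation."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    cmd = ["python", "McNetCLI.py", subcommand, "--config", config]
    if gpus is not None:
        # GPU indices are passed as a comma-separated list, e.g. "0,1".
        cmd += ["--trainer.gpus", ",".join(str(g) for g in gpus)]
    if ckpt_path is not None:
        cmd += ["--ckpt_path", ckpt_path]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "predict",
        "config/mc_net_offline.yaml",
        gpus=[0, 1],
        ckpt_path="model_checkpoints/offline/epoch494_criteria18.78_sdr18.78.ckpt",
    )
    print(shlex.join(cmd))
    # To actually run it from the repo root:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting problems, and writing paths with forward slashes keeps the same command usable on both Linux and Windows.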

mcnet's People

Contributors

quancs, yang-yujie

mcnet's Issues

Question: Pretrained Model

Hi,

Thanks for the work. I am curious about the pretrained model. Will you release it?

McNet layers and McNetIO

Thanks for the pretrained model! When I load the model from the checkpoint, I see io channels from 0 to 5. What does that mean?
How can I use the checkpoint to run prediction (enhancement only) on an audio file?
Could you also explain the model's forward pass?

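Training with other datasets

Hello, I would like to use multiple single-channel recordings from another dataset as the multichannel input for training, but noisy_dataset_dir in the code (config/.yaml) takes only a single path.
How is the noisy data stored and accessed? How would you suggest modifying the code?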

Training data

How can I obtain the training data, i.e. the data in ~/simu-data/training_dataset/clean_speech/?

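Question about dataset preprocessing

Hello, the dataset paths used in the code seem to differ from the structure of the original CHiME-4 dataset. Does the CHiME-4 dataset need to be preprocessed after it is obtained? Could you provide the dataset structure used for training and the processing scripts? Thanks! Some documentation would be even better :)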

How well does it work on real data?

Hi, do you have any requirements for the configuration of the microphone array? I ran the pretrained model on real recordings, and the results were very poor.
