f90 / adversarialaudioseparation



Code accompanying the paper "Semi-supervised adversarial audio source separation applied to singing voice extraction"

Home Page: https://arxiv.org/abs/1711.00048

License: MIT License

Python 100.00%

deep-learning audio-source-separation adversarial-networks semi-supervised-learning mit-license paper audio

adversarialaudioseparation's Issues

Dataset links

Hi,
Do you have links to the datasets you use? I am new to these datasets, but the paper is very interesting and I want to reproduce the results. However, it is not easy to find them, or to see how you convert the downloaded datasets into the formatted dataset. Although the dataset structure is described in the README, it is still not clear to me what a formatted dataset should look like. Would you mind elaborating on that a little, for example by adding a sample dataset to the repo?
Thanks!

Stuck on "Loading new item into cache from data list starting with"

Hi,

Thanks for your excellent code. I am trying to reproduce your results, but training gets stuck while loading an item into the cache. Below is the output log:

Do you have any idea what causes this? The program hangs on this line: self.update_next_cache_item(self.communication_queue.get())
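
A note on narrowing down a hang like this, as a sketch under assumptions: a blocking get() on a queue waits forever if nothing is ever put into it, for example because the producing worker crashed or never started. The helper get_or_fail below is hypothetical and not part of the repository, and it uses the standard library queue.Queue, whereas the project's communication_queue may be a different queue type.

```python
import queue
import threading


def get_or_fail(q, timeout_s=5.0):
    """Fetch one item from q, but raise instead of blocking forever."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        # Nothing arrived in time: the producer is probably dead or stalled.
        raise RuntimeError(
            "no item arrived within %.1f s; is the producer still running?" % timeout_s
        )


# Demo with a producer thread that does deliver an item.
q = queue.Queue()
worker = threading.Thread(target=lambda: q.put("item"))
worker.start()
worker.join()
item = get_or_fail(q, timeout_s=1.0)
```

Swapping a timeout in temporarily turns a silent hang into an error with a traceback, which makes it easier to see whether the loading worker ever produced anything.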

Pretrained model?

Hi,

I know this is a bit late, but it would be nice if a pretrained model were available for download, both to easily recreate the original results and to use on custom audio.

In lieu of that, I'm trying to recreate the experiment, but I'm having some difficulty. Although the readme helpfully explains what to do, I'm not sure if I can obtain the same datasets. iKala is apparently no longer available at all, and MedleyDB is only available on request. I guess I'll try training using only the other two...

Is it a Python 2 project?

Is this a Python 2 project? I get errors on imports such as cPickle and audiolab.
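
The cPickle error does point to Python 2: that module exists only there, and Python 3 offers the same functionality through pickle. A common compatibility idiom is sketched below. It is not a tested patch for this repository, the sample data is made up, and it does not address the audiolab import.

```python
try:
    import cPickle as pickle  # Python 2: the C-accelerated pickle module
except ImportError:
    import pickle  # Python 3: cPickle no longer exists as a separate module

# Round-trip some made-up data to show that the alias works either way.
data = {"track": "example", "frames": [1, 2, 3]}
restored = pickle.loads(pickle.dumps(data))
```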

Normalizing spectrograms

As far as I know, the result of log1p(x) can be negative. You use this function to 'normalize' the spectrograms of the target accompaniment and vocals, and then use the difference between the network outputs and these spectrograms in your loss function. However, network outputs after a ReLU can't be negative.
I have read the paper and I realize that it must work, so what am I missing? Please help.
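
A quick check of the premise, assuming the inputs are magnitude spectrograms, so every entry is at least 0 (the values below are made up): log1p(x) = log(1 + x) is negative only for -1 < x < 0. On non-negative inputs it never drops below zero, so the targets stay within the range a ReLU output can reach.

```python
import math

# Made-up magnitude values; a magnitude spectrogram has no negative entries.
magnitudes = [0.0, 0.001, 0.5, 1.0, 10.0, 1000.0]
normalized = [math.log1p(m) for m in magnitudes]

# log1p(0) is exactly 0 and the function increases with x,
# so every normalized value is non-negative.
all_non_negative = all(v >= 0.0 for v in normalized)

# A negative result requires an input between -1 and 0,
# which a magnitude can never be.
negative_case = math.log1p(-0.5)
```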

Fully adversarial training

Before I try it myself, I wanted to ask whether you tried training the network from scratch with fully adversarial training, without finetuning. Is that too hard to train? Did you try any other conditional GAN flavors?
