Comments (8)
Hi @va-volokhov, I ran into a similar situation.
The difference came from train-other-500. When I included it, I got a score similar to yours (SDR: 4~5 dB), but when I left it out (keeping train-clean-100, train-clean-360, and the other smaller sets), I could reach around 10 dB as well. That doesn't mean you shouldn't include it, though (more data should generally give a better result).
The choice of test dataset caused this difference. As you can see from @seungwonpark's evaluation code, there is only one validation. Therefore, if your test data were selected from a different folder, you would get a different result.
However, it will still be difficult to match the score from the paper. In the original paper, the median and mean SDR are already high even before separation, which suggests that when two audio files are mixed, the interfering audio does not overlap all of the clean audio (perhaps because it is shorter?). As a result, the tail of some mixed audio may be just clean audio.
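To make the point about partial overlap concrete, here is a minimal sketch. It assumes a simplified energy-ratio SDR (not the bss_eval-style metric an evaluation script would normally use) and uses white noise as a stand-in for speech; the helper names `sdr_db` and `mix` are made up for this illustration. It only shows that the SDR before separation goes up when the interference covers just part of the clean signal.

```python
import math
import random


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - x||^2)."""
    signal_power = sum(s * s for s in reference)
    noise_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


def mix(clean, interference):
    """Add interference to clean; samples past the end of interference stay clean."""
    mixed = list(clean)
    for i, value in enumerate(interference[:len(clean)]):
        mixed[i] += value
    return mixed


rng = random.Random(0)
n = 16000  # hypothetical length: one second at 16 kHz
clean = [rng.gauss(0.0, 1.0) for _ in range(n)]
full = [rng.gauss(0.0, 1.0) for _ in range(n)]  # interference over the whole signal
half = full[: n // 2]                           # interference over the first half only

sdr_full = sdr_db(clean, mix(clean, full))
sdr_half = sdr_db(clean, mix(clean, half))
print(f"full overlap: {sdr_full:.2f} dB")
print(f"half overlap: {sdr_half:.2f} dB")
```

With equal-power signals, halving the overlapped region roughly halves the interference energy, so the unprocessed mixture already scores about 3 dB higher under this simplified metric.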
Therefore, the easiest way to compare your performance is, as @seungwonpark said, to compare against the published samples, and against the original paper's published samples as well (https://google.github.io/speaker-id/publications/VoiceFilter/).
By the way, thank you for the excellent implementation @seungwonpark, including all the preprocessing.
from voicefilter.
Actually, the resulting SDR strongly depends on the test data that we use.
Try running inference.py with the audio samples shown at http://swpark.me/voicefilter/, and compare the resulting quality with mine by listening to them.
from voicefilter.
I used batch_size=12, so this won't affect much.
from voicefilter.
Hi @thejungwon, thank you for your answer. Yes, after excluding the train-other-500 subset from training and testing on data from dev-clean, the SDR behavior becomes similar to that of @seungwonpark. Thank you for the help @thejungwon! Thank you for the excellent implementation @seungwonpark!
from voicefilter.
Hi @thejungwon, thanks for pointing out that excluding train-other-500 is helpful for training! I will make a commit that excludes train-other-500 from generator.py.
I would also like to thank @va-volokhov for kindly sharing this issue here.
from voicefilter.
@va-volokhov @seungwonpark Can you please suggest how to train on 2 GPUs with this code? Training on a single GPU is too slow. Is there any degradation in performance when moving from 1 to 2 GPUs? @va-volokhov, it seems you have used even more GPUs. Do you mind sharing the code snippets that would let us run this code on 2 GPUs? Thanks guys!
from voicefilter.
Hi @thejungwon, thank you for your answer. Yes, after excluding the train-other-500 subset from training and testing on data from dev-clean, the SDR behavior becomes similar to that of @seungwonpark. Thank you for the help @thejungwon! Thank you for the excellent implementation @seungwonpark!
Do you mean this?
training: train-clean-360; train-clean-100
testing: test-clean
skip files: train-other-500; dev-clean; dev-other; test-other
Also, I see that in the latest code the author uses train-clean-360 and train-clean-100 for training and dev-clean for testing. Unfortunately, my trained model gets a bad SDR.
Hoping for your reply!
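For reference, the split described earlier in this thread (train on train-clean-100 and train-clean-360, test on dev-clean, leave out train-other-500) could be sketched roughly as follows. This is not the repository's generator.py. The function `list_speaker_dirs`, the constants, and the speaker IDs in the demo are hypothetical. The only assumption about the data is the standard LibriSpeech layout of `<subset>/<speaker>/<chapter>/`.

```python
import os
import tempfile

# Assumption: subset choice follows the discussion in this thread; adjust as needed.
TRAIN_SUBSETS = ("train-clean-100", "train-clean-360")
TEST_SUBSETS = ("dev-clean",)


def list_speaker_dirs(root, subsets):
    """Return speaker directories found under the given LibriSpeech subsets."""
    speaker_dirs = []
    for subset in subsets:
        subset_dir = os.path.join(root, subset)
        if not os.path.isdir(subset_dir):
            continue  # subset not present on disk; skip it
        for name in sorted(os.listdir(subset_dir)):
            path = os.path.join(subset_dir, name)
            if os.path.isdir(path):
                speaker_dirs.append(path)
    return speaker_dirs


def demo():
    """Build a tiny fake directory tree and show which speakers get selected."""
    layout = [
        ("train-clean-100", "19"),
        ("train-clean-360", "14"),
        ("train-other-500", "20"),  # present on disk but never selected
        ("dev-clean", "84"),
    ]
    with tempfile.TemporaryDirectory() as root:
        for subset, speaker in layout:
            os.makedirs(os.path.join(root, subset, speaker))
        train = [os.path.relpath(p, root) for p in list_speaker_dirs(root, TRAIN_SUBSETS)]
        test = [os.path.relpath(p, root) for p in list_speaker_dirs(root, TEST_SUBSETS)]
    return train, test


train_speakers, test_speakers = demo()
print(train_speakers)
print(test_speakers)
```

Keeping the subset names in two explicit tuples makes it easy to check that the training and test speakers come from the folders you intended, which matters here because the reported SDR depends strongly on the test data.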
from voicefilter.
I have encountered the same problem. Do you have a way to solve it?
from voicefilter.