
noisy_label_understanding_utilizing's Introduction

Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

This is a Keras implementation for the paper Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels (Proceedings of ICML, 2019).

@inproceedings{chen2019understanding,
  title={Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels},
  author={Chen, Pengfei and Liao, Ben Ben and Chen, Guangyong and Zhang, Shengyu},
  booktitle={International Conference on Machine Learning},
  pages={1062--1070},
  year={2019}
}

Dependencies

Python 3.6.4, Keras 2.1.6, Tensorflow 1.7.0, numpy, sklearn.

Please be aware of bugs caused by mismatched Keras/TensorFlow versions. For example, in the callbacks passed to model.fit_generator, newer Keras versions log "val_accuracy" instead of "val_acc"; monitoring the wrong key raises no error but silently fails to save the model. Please check the Keras documentation carefully if you use a different version.
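To guard against this version difference, a small helper can pick whichever validation-accuracy key the running Keras version actually logs (the key names below are the only assumption; they are the two spellings used across Keras releases):

```python
def val_acc_key(logged_keys):
    """Return whichever validation-accuracy key this Keras version logs.

    Newer Keras versions log 'val_accuracy'; older ones log 'val_acc'.
    Monitoring the wrong key makes ModelCheckpoint silently skip saving.
    """
    for key in ("val_accuracy", "val_acc"):
        if key in logged_keys:
            return key
    raise KeyError("no validation-accuracy key found in logs")
```

After one epoch you can call, e.g., `val_acc_key(history.history)` and pass the result as the `monitor` argument of `ModelCheckpoint`.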

Setup

To set up experiments, we need to download the CIFAR-10 data and extract it to:

data/cifar-10-batches-py

The code will automatically add noise to CIFAR-10 by randomly flipping original labels.
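As a rough sketch of what "randomly flipping original labels" means for symmetric noise (the repository's exact routine may differ, e.g. in how the flipped fraction is drawn), each corrupted sample gets a uniformly random different class:

```python
import numpy as np

def flip_labels_symmetric(y, noise_ratio, num_classes=10, seed=0):
    """Corrupt a label array with symmetric noise: a fraction
    `noise_ratio` of samples is re-labeled with a uniformly random
    *different* class."""
    rng = np.random.RandomState(seed)
    y_noisy = y.copy()
    n_flip = int(noise_ratio * len(y))
    idx = rng.choice(len(y), n_flip, replace=False)
    for i in idx:
        # Pick any class except the true one.
        choices = [c for c in range(num_classes) if c != y[i]]
        y_noisy[i] = rng.choice(choices)
    return y_noisy
```

Asymmetric noise instead flips each class to a fixed "similar" class (e.g. truck → automobile on CIFAR-10).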

Understanding noisy labels

Note

To quantitatively characterize the generalization performance of deep neural networks normally trained with noisy labels, we split the noisy dataset into two halves and perform cross-validation: training on a subset and testing on the other.
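The half-half split described above can be sketched as a simple random partition of the sample indices (the seed and helper name here are illustrative, not the repository's actual code):

```python
import numpy as np

def split_halves(n, seed=0):
    """Randomly split indices 0..n-1 into two equal halves for the
    noisy cross-validation step: train on one half, test on the other."""
    rng = np.random.RandomState(seed)
    perm = rng.permutation(n)
    return perm[: n // 2], perm[n // 2 :]
```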

We first theoretically characterize, on the test set, the confusion matrix (w.r.t. ground-truth labels) and the test accuracy (w.r.t. noisy labels).

We then propose to select a test sample as clean if the trained model predicts the same label as its observed label. The selection is evaluated by label precision and label recall, which, as shown in our paper, can be theoretically estimated from the noise ratio.
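The selection rule and its two metrics can be written down directly (function and variable names are illustrative):

```python
import numpy as np

def label_precision_recall(y_true, y_noisy, y_pred):
    """Select a sample as 'clean' when the model's prediction agrees
    with its observed (noisy) label, then score the selection.

    Label precision (LP): fraction of selected samples that are truly clean.
    Label recall   (LR): fraction of truly clean samples that get selected.
    """
    selected = (y_pred == y_noisy)          # the selection rule
    clean = (y_noisy == y_true)             # ground-truth cleanliness
    hit = (selected & clean).sum()
    lp = hit / max(selected.sum(), 1)
    lr = hit / max(clean.sum(), 1)
    return lp, lr
```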

Train

Experimental results justify our theoretical analysis. To reproduce them, run Verify_Theory.py and specify the noise pattern and noise ratio, e.g.,

  • Symmetric noise with ratio 0.5:

    python Verify_Theory.py --noise_pattern sym --noise_ratio 0.5

  • Asymmetric noise with ratio 0.4:

    python Verify_Theory.py --noise_pattern asym --noise_ratio 0.4

Results

Test accuracy, label precision, and label recall w.r.t. the noise ratio on manually corrupted CIFAR-10.

The confusion matrix M approximates the noise transition matrix T.

Simply cleaning noisy datasets

Train

If you only want to use INCV to clean a noisy dataset, you can run INCV.py only, e.g., on CIFAR-10 with

  • 50% symmetric noise:

    python INCV.py --noise_pattern sym --noise_ratio 0.5 --dataset cifar10

  • 40% asymmetric noise:

    python INCV.py --noise_pattern asym --noise_ratio 0.4 --dataset cifar10

The results will be saved in 'results/(dataset)/(noise_pattern)/(noise_ratio)/(XXX.csv)' with columns ('y', 'y_noisy', 'select', 'candidate', 'eval_ratio').
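A quick way to evaluate the saved selection is to compare the 'select' flag against whether 'y' and 'y_noisy' agree. The column semantics below are assumptions inferred from the paper ('y' ground-truth label, 'y_noisy' observed label, 'select' = 1 when INCV kept the sample as clean), illustrated on an in-memory stand-in for the CSV:

```python
import csv
import io

# Miniature stand-in for a results/.../XXX.csv file.
CSV_TEXT = """y,y_noisy,select
0,0,1
0,1,1
1,1,1
1,0,0
2,2,1
"""

def selection_quality(csv_text):
    """Compute label precision and label recall from the saved columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    clean = [r["y"] == r["y_noisy"] for r in rows]
    selected = [r["select"] == "1" for r in rows]
    hit = sum(c and s for c, s in zip(clean, selected))
    lp = hit / sum(selected)   # label precision
    lr = hit / sum(clean)      # label recall
    return lp, lr
```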

Results

Label precision and label recall on the manually corrupted CIFAR-10.

Our INCV accurately identifies most clean samples. For example, under symmetric noise of ratio 0.5, it selects about 90% (=LR) of the clean samples, and the noise ratio of the selected set is reduced to around 10% (=1−LP).

Cleaning noisy datasets and robustly training deep neural networks

Note

We present the Iterative Noisy Cross-Validation (INCV) to select a subset of clean samples, then modify the Co-teaching strategy to train noise-robust deep neural networks.
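The Co-teaching component keeps, per mini-batch, the small-loss samples of each network and feeds them to the other network for its update. A minimal sketch of that selection step (the keep ratio schedule and all names are simplified assumptions, not the repository's exact code):

```python
import numpy as np

def co_teaching_select(loss_net1, loss_net2, keep_ratio):
    """One co-teaching step: each network keeps its smallest-loss
    samples and hands them to the *other* network for the weight
    update, so their selection errors do not reinforce each other."""
    k = int(keep_ratio * len(loss_net1))
    idx_for_net2 = np.argsort(loss_net1)[:k]  # net1 picks, net2 trains
    idx_for_net1 = np.argsort(loss_net2)[:k]  # net2 picks, net1 trains
    return idx_for_net1, idx_for_net2
```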

Train

E.g., use our method to train on CIFAR-10 with

  • 50% symmetric noise:

    python INCV_main.py --noise_pattern sym --noise_ratio 0.5 --dataset cifar10

  • 40% asymmetric noise:

    python INCV_main.py --noise_pattern asym --noise_ratio 0.4 --dataset cifar10

Results

Average test accuracy (%, 5 runs) with standard deviation:

Method        Sym. 0.2     Sym. 0.5     Sym. 0.8     Asym. 0.4
F-correction  85.08±0.43   76.02±0.19   34.76±4.53   83.55±2.15
Decoupling    86.72±0.32   79.31±0.62   36.90±4.61   75.27±0.83
Co-teaching   89.05±0.32   82.12±0.59   16.21±3.02   84.55±2.81
MentorNet     88.36±0.46   77.10±0.44   28.89±2.29   77.33±0.79
D2L           86.12±0.43   67.39±13.62  10.02±0.04   85.57±1.21
Ours          89.71±0.18   84.78±0.33   52.27±3.50   86.04±0.54

Average test accuracy (%, 5 runs) during training:

Cite

Please cite our paper if you use this code in your research work.

Questions/Bugs

Please submit a Github issue or contact [email protected] if you have any questions or find any bugs.

noisy_label_understanding_utilizing's People

Contributors

chenpf1025


noisy_label_understanding_utilizing's Issues

Results on WebVision

Hi,
thanks for your great work.
I'm so sorry to bother you because this repo is a few years old, but I'd still like to know about your setup for experiments on webvision. As you may know, the results of this article are widely used in many articles, mostly the baseline numbers on webvsion, but after doing direct supervised learning without any modifications[no small loss things], I found that the results of the webvsion dataset can reach top1 acc ~75. I use 64batchsize, inceptionresnetv2, 100epochs, lr=0.01, changing to 0.001 and 0.0001 in 50th and 80th epoch. I'm worried whether most of the current works are really improving or not?

Best,
Chen

Required Version

First I would like to thank you for making your work easy to replicate. I believe this is a truly inspiring paper.

I am simply writing to ask which versions of TensorFlow and Keras you ran these experiments with.

Issue on theoretic result corresponding to Zhang Chiyuan ICLR-17 paper

Hi! Thank you for your interesting paper. But I have a small issue.
In this paper, it says "Interestingly, Eq.(4) perfectly fits the experimental results of generalization accuracy shown in Fig. 1(c) of (Zhang et al. 2017), ..."

But in (Zhang et al. 2017), it seems that the test accuracy is tested on clean labels.
And in this paper, Eq.(4) reads P(y^f = y) = (1-epsilon)^2 + epsilon^2/(c-1), where y is the noisy label according to the notation of this paper.
So maybe they can't correspond to each other.
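For reference, Eq.(4) as stated in the issue is easy to evaluate numerically (this only restates the formula above; whether it matches (Zhang et al. 2017) is the open question of the issue):

```python
def eq4_test_accuracy(eps, c):
    """Eq.(4): accuracy w.r.t. the *noisy* test labels of a model
    normally trained under symmetric noise of ratio eps with c classes."""
    return (1 - eps) ** 2 + eps ** 2 / (c - 1)
```

For example, with eps=0.5 and c=10 it gives 0.25 + 0.25/9 ≈ 0.278, and with eps=0 it reduces to 1 as expected.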

Confused about computing the estimated noise ratio ε

Firstly, thank you for your paper. I am trying to implement it in a language-identification setup with asymmetric / real-world noisy labels. However, I have a hard time understanding how you compute the estimated noise ratio ε (step 8 in the pseudo-code of Algorithm 2 in the paper, leading to Eq. 4). I have already looked at the paper and the code, but I am still confused. Would you be so kind as to explain a bit how that is calculated?

cannot run INCV individually

Hi Pengfei,

As you said, if we only want to use INCV to clean a noisy dataset, we can run INCV.py alone. However, it seems we still need the model trained by Verify_Theory.py. Is that correct, or do we indeed need to acquire the model in advance?

Thank you!

tabular data/ noisy instances/ new datasets

Hi,
thanks for sharing your implementation. I have some questions about it:

  1. Does it also work on tabular data?
  2. Is the code tailored to the datasets used in the paper or can one apply it to any data?
  3. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!
