yyzharry / imbalanced-semi-self

[NeurIPS 2020] Semi-Supervision (Unlabeled Data) & Self-Supervision Improve Class-Imbalanced / Long-Tailed Learning

Home Page: https://arxiv.org/abs/2006.07529

License: MIT License

Python 100.00%
imbalanced-learning imbalanced-classification semi-supervised-learning unlabeled-data self-supervised-learning long-tail long-tailed-recognition class-imbalance neurips neurips-2020

imbalanced-semi-self's Introduction

Rethinking the Value of Labels for Improving Class-Imbalanced Learning

This repository contains the implementation code for paper:
Rethinking the Value of Labels for Improving Class-Imbalanced Learning
Yuzhe Yang, and Zhi Xu
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
[Website] [arXiv] [Paper] [Slides] [Video]

If you find this code or idea useful, please consider citing our work:

@inproceedings{yang2020rethinking,
  title={Rethinking the Value of Labels for Improving Class-Imbalanced Learning},
  author={Yang, Yuzhe and Xu, Zhi},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Overview

In this work, we show theoretically and empirically that both semi-supervised learning (using unlabeled data) and self-supervised pre-training (pre-training the model with self-supervision before standard training) can substantially improve performance on imbalanced (long-tailed) datasets, regardless of the degree of imbalance in the labeled/unlabeled data and of the base training technique.

Semi-Supervised Imbalanced Learning: Using unlabeled data helps shape clearer class boundaries and results in better class separation, especially for the tail classes.

Self-Supervised Imbalanced Learning: Self-supervised pre-training (SSP) helps mitigate tail-class leakage during testing, which results in better learned boundaries and representations.

Installation

Prerequisites

Dependencies

  • PyTorch (>= 1.2, tested on 1.4)
  • yaml
  • scikit-learn
  • TensorboardX
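
A minimal environment setup along these lines should cover the dependencies above (the pip package names are assumptions; adjust the PyTorch install to your CUDA setup):

pip install "torch>=1.2" pyyaml scikit-learn tensorboardX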

Code Overview

Main Files

  • train.py: standard (or SSP-initialized) training on long-tailed datasets
  • train_semi.py: semi-supervised training with unlabeled data and pseudo-labels
  • gen_pseudolabels.py: generate pseudo-labels for unlabeled data with a trained base classifier
  • pretrain_rot.py / pretrain_moco.py: self-supervised pre-training with Rotation / MoCo
  • imagenet_inat/main.py: training and evaluation on ImageNet-LT and iNaturalist 2018

Main Arguments

  • --dataset: name of the chosen long-tailed dataset
  • --imb_factor: imbalance factor (inverse of the imbalance ratio \rho in the paper); see the example after this list
  • --imb_factor_unlabel: imbalance factor for the unlabeled data (inverse of the unlabeled-data imbalance ratio \rho_U)
  • --pretrained_model: path to self-supervised pre-trained models
  • --resume: path to a checkpoint to resume from (also used for evaluation)
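
For example, since --imb_factor is the inverse of \rho, \rho=100 corresponds to --imb_factor 0.01 (commands shown for illustration; paths are placeholders):

# CIFAR-10-LT with \rho = 100 (imb_factor = 1 / \rho)
python train.py --dataset cifar10 --imb_factor 0.01
# resume from a checkpoint (add -e to evaluate only, as in the Results section)
python train.py --dataset cifar10 --imb_factor 0.01 --resume <ckpt-path>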

Getting Started

Semi-Supervised Imbalanced Learning

Unlabeled data sourcing

CIFAR-10-LT: Unlabeled data for CIFAR-10 is prepared following this repo, using the 80 Million Tiny Images dataset. In short, a data sourcing model is trained to distinguish the CIFAR-10 classes plus a "non-CIFAR" class. For each class, images are then ranked by prediction confidence, and the unlabeled (imbalanced) datasets are constructed accordingly. Use the following link to download the prepared unlabeled data, and place it in your data_path:

SVHN-LT: The SVHN dataset ships with an "extra" split of 531.1K additional (labeled) samples, which are used directly to simulate the unlabeled dataset.

Note that class imbalance in the unlabeled data is also considered; it is controlled by --imb_factor_unlabel (\rho_U in the paper). See imbalance_cifar.py and imbalance_svhn.py for details.
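
As a rough sketch of the idea (not necessarily the repository's exact code), an exponential long-tailed class distribution can be generated from an imbalance factor as follows, where imb_factor = 1/\rho:

def get_img_num_per_cls(cls_num, img_max, imb_factor):
    """Samples per class for an exponential long-tailed profile.

    imb_factor is the inverse imbalance ratio (e.g. 0.02 for rho = 50),
    so the rarest class keeps img_max * imb_factor samples.
    """
    return [int(img_max * imb_factor ** (i / (cls_num - 1.0)))
            for i in range(cls_num)]

# Example: CIFAR-10-LT with rho = 50 and 5000 images in the head class
print(get_img_num_per_cls(10, 5000, 0.02))  # [5000, 3237, ..., 100]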

Semi-supervised learning with pseudo-labeling

To perform pseudo-labeling (self-training), a base classifier is first trained on the original imbalanced dataset. With the trained base classifier, pseudo-labels can be generated using

python gen_pseudolabels.py --resume <ckpt-path> --data_dir <data_path> --output_dir <output_path> --output_filename <save_name>
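
Conceptually, pseudo-labeling just runs the trained base classifier over the unlabeled images and keeps the predicted class as the label. A minimal sketch (assuming a standard PyTorch model and a loader that yields image batches; this is not the exact gen_pseudolabels.py logic):

import torch

@torch.no_grad()
def generate_pseudo_labels(model, unlabeled_loader, device="cuda"):
    """Assign each unlabeled image the class predicted by the base classifier."""
    model.eval()
    pseudo_labels = []
    for images in unlabeled_loader:          # batches of images, no labels
        logits = model(images.to(device))
        pseudo_labels.append(logits.argmax(dim=1).cpu())
    return torch.cat(pseudo_labels)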

We provide generated pseudo-label files for CIFAR-10-LT & SVHN-LT with \rho=50, using base models trained with the standard cross-entropy (CE) loss:

To train with unlabeled data, for example, on CIFAR-10-LT with \rho=50 and \rho_U=50

python train_semi.py --dataset cifar10 --imb_factor 0.02 --imb_factor_unlabel 0.02

Self-Supervised Imbalanced Learning

Self-supervised pre-training (SSP)

To perform Rotation SSP on CIFAR-10-LT with \rho=100

python pretrain_rot.py --dataset cifar10 --imb_factor 0.01
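
As background, the rotation pretext task turns unlabeled images into a 4-way classification problem. A hedged sketch of the label construction (an assumption about the general technique, not the exact pretrain_rot.py code):

import torch

def make_rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees and return rotation labels.

    images: tensor of shape (B, C, H, W). Returns a (4B, C, H, W) tensor and
    a (4B,) label tensor in {0, 1, 2, 3}; the network is trained to predict
    the rotation, which requires no class labels at all.
    """
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return rotated, labels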

To perform MoCo SSP on ImageNet-LT

python pretrain_moco.py --dataset imagenet --data <data_path>

Network training with SSP models

Train on CIFAR-10-LT with \rho=100

python train.py --dataset cifar10 --imb_factor 0.01 --pretrained_model <path_to_ssp_model>
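
Under the hood, using an SSP model typically means initializing the backbone from the pre-trained checkpoint while keeping a freshly initialized classifier head. A minimal sketch (assumed loading logic, not necessarily how train.py handles --pretrained_model):

import torch

def load_ssp_backbone(model, ckpt_path):
    """Load backbone weights from an SSP checkpoint; keep a fresh classifier.

    Keys that do not match the target model (e.g. a rotation/MoCo head or
    the final fc layer) are skipped via strict=False.
    """
    state = torch.load(ckpt_path, map_location="cpu")
    state = state.get("state_dict", state)      # unwrap common checkpoint formats
    state = {k: v for k, v in state.items() if not k.startswith("fc.")}
    missing, unexpected = model.load_state_dict(state, strict=False)
    print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
    return model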

Train on ImageNet-LT / iNaturalist 2018

python -m imagenet_inat.main --cfg <path_to_ssp_config> --model_dir <path_to_ssp_model>

Results and Models

All related data and checkpoints can be found via this link. Individual results and checkpoints are detailed as follows.

Semi-Supervised Imbalanced Learning

CIFAR-10-LT

Model Top-1 Error Download
CE + D_U@5x (\rho=50 and \rho_U=1) 16.79 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=25) 16.88 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=50) 18.36 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=100) 19.94 ResNet-32

SVHN-LT

Model Top-1 Error Download
CE + D_U@5x (\rho=50 and \rho_U=1) 13.07 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=25) 13.36 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=50) 13.16 ResNet-32
CE + D_U@5x (\rho=50 and \rho_U=100) 14.54 ResNet-32

Test a pretrained checkpoint

python train_semi.py --dataset cifar10 --resume <ckpt-path> -e

Self-Supervised Imbalanced Learning

CIFAR-10-LT

  • Self-supervised pre-trained models (Rotation)

    Dataset Setting \rho=100 \rho=50 \rho=10
    Download ResNet-32 ResNet-32 ResNet-32
  • Final models (200 epochs)

    Model \rho Top-1 Error Download
    CE(Uniform) + SSP 10 12.28 ResNet-32
    CE(Uniform) + SSP 50 21.80 ResNet-32
    CE(Uniform) + SSP 100 26.50 ResNet-32
    CE(Balanced) + SSP 10 11.57 ResNet-32
    CE(Balanced) + SSP 50 19.60 ResNet-32
    CE(Balanced) + SSP 100 23.47 ResNet-32

CIFAR-100-LT

  • Self-supervised pre-trained models (Rotation)

    Dataset Setting \rho=100 \rho=50 \rho=10
    Download ResNet-32 ResNet-32 ResNet-32
  • Final models (200 epochs)

    Model \rho Top-1 Error Download
    CE(Uniform) + SSP 10 42.93 ResNet-32
    CE(Uniform) + SSP 50 54.96 ResNet-32
    CE(Uniform) + SSP 100 59.60 ResNet-32
    CE(Balanced) + SSP 10 41.94 ResNet-32
    CE(Balanced) + SSP 50 52.91 ResNet-32
    CE(Balanced) + SSP 100 56.94 ResNet-32

ImageNet-LT

  • Self-supervised pre-trained models (MoCo)
    [ResNet-50]

  • Final models (90 epochs)

    Model Top-1 Error Download
    CE(Uniform) + SSP 54.4 ResNet-50
    CE(Balanced) + SSP 52.4 ResNet-50
    cRT + SSP 48.7 ResNet-50

iNaturalist 2018

  • Self-supervised pre-trained models (MoCo)
    [ResNet-50]

  • Final models (90 epochs)

    Model Top-1 Error Download
    CE(Uniform) + SSP 35.6 ResNet-50
    CE(Balanced) + SSP 34.1 ResNet-50
    cRT + SSP 31.9 ResNet-50

Test a pretrained checkpoint

# test on CIFAR-10 / CIFAR-100
python train.py --dataset cifar10 --resume <ckpt-path> -e

# test on ImageNet-LT / iNaturalist 2018
python -m imagenet_inat.main --cfg <path_to_ssp_config> --model_dir <path_to_model> --test

Acknowledgements

This code is partly based on the open-source implementations from the following sources: OpenLongTailRecognition, classifier-balancing, LDAM-DRW, MoCo, and semisup-adv.

Contact

If you have any questions, feel free to contact us through email ([email protected] & [email protected]) or GitHub issues. Enjoy!

imbalanced-semi-self's People

Contributors

yyzharry

imbalanced-semi-self's Issues

Error: No module named 'dataset.resnet_cifar' when running

When I run this command:
python train_semi.py --dataset cifar10 --imb_factor 0.02 --imb_factor_unlabel 0.02

I got this error:
Traceback (most recent call last):
  File "train_semi.py", line 15, in <module>
    from dataset.imbalance_cifar import SemiSupervisedImbalanceCIFAR10
  File "/home/insights-user/imbalanced-semi-self/dataset/__init__.py", line 1, in <module>
    from .resnet_cifar import *
ModuleNotFoundError: No module named 'dataset.resnet_cifar'

Category imbalance

I'm doing semantic segmentation with DeepLabv3+, but I have a problem with category imbalance. Please help me solve this problem. Thanks.

Why use 5 times more unlabeled data?

I read the paper and have a question about Appendix E3: Effect of Unlabeled Data Amount.

The results of CE + D_U are 21.75, 20.35, 18.36, and 16.88 for {0.5x, 1x, 5x, 10x} unlabeled data, respectively.
The 10x result is better than the 5x result, yet the paper uses 5x unlabeled data.

Is there a reason?

moco on cifar dataset

Thanks for the great repo!

I have a quick question: is there any specific reason for not adding the CIFAR & SVHN datasets to the MoCo training script? For example, is MoCo unsuitable for them, or is its performance poor on such small datasets?

Thanks!

Some problems about the assumption in the paper

Hi, I'm very interested in your paper. Especially, the proofs attract me. However, I meet some questions on understanding the proof.

"We assume a properly designed black-box self-supervised task so that the learned representation is Z = k1 ||X||^{2} + k2, where k1, k2 > 0. Precisely, this means that we have access to the new features Zi for the i-th data after the black-box self-supervised step,
without knowing explicitly what the transformation ψ is. "

I'm confused by the following questions:
(1) Why a properly designed black-box self-supervised task can obtain the learned representation, Z = k1 ||X||^{2} + k2 ? whether the moco or rotation-based self-supervised method respect this assumption?

(2) Why the supervised classification task can not obtain the similar representation, Z = k1 ||X||^{2} + k2 ?

Where can I set CE(Uniform) and CE(Balanced)?

I see the self-supervised pre-training (SSP) results.
There are several model variants under SSP:

  1. CE(Uniform) + SSP
  2. CE(Balanced) + SSP

Where can I set CE(Balanced) in the train.py code?
It looks to me like per_cls_weights controls whether training is uniform or balanced.
Does the CE(Balanced) setting correspond to 'Reweight' in args.train_rule?

    if args.train_rule == 'Reweight':
        # Class-balanced re-weighting via the effective number of samples
        beta = 0.9999
        effective_num = 1.0 - np.power(beta, cls_num_list)
        per_cls_weights = (1.0 - beta) / np.array(effective_num)
        per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
        per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
    elif args.train_rule == 'DRW':
        # Deferred re-weighting: uniform weights first, re-weight after epoch 160
        idx = epoch // 160
        betas = [0, 0.9999]
        effective_num = 1.0 - np.power(betas[idx], cls_num_list)
        per_cls_weights = (1.0 - betas[idx]) / np.array(effective_num)
        per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
        per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
    else:
        # 'None' (uniform) training: no per-class weights
        per_cls_weights = None

    if args.loss_type == 'CE':
        criterion = nn.CrossEntropyLoss(weight=per_cls_weights).cuda(args.gpu)
    elif args.loss_type == 'LDAM':
        criterion = LDAMLoss(cls_num_list=cls_num_list, max_m=0.5, s=30, weight=per_cls_weights).cuda(args.gpu)
    elif args.loss_type == 'Focal':
        criterion = FocalLoss(weight=per_cls_weights, gamma=1).cuda(args.gpu)
    else:
        warnings.warn('Loss type is not listed')
        return

Question about the Self-supervised pre-trained models (MoCo)

Thanks for your excellent code! I reproduced the results of CE(Uniform) + SSP and cRT + SSP on ImageNet-LT based on the self-supervised pre-trained model (MoCo), and got the same results as reported in your paper.
But I still have a question about the MoCo SSP checkpoint.
I directly evaluated the MoCo checkpoint + cRT (without the CE-uniform supervised training), and the accuracy is 0.118, which is not good. But according to the original MoCo paper, the accuracy of MoCo on full ImageNet should be 0.60+, which is not far from supervised learning.
So is the 0.118 accuracy reasonable? It is much lower than the supervised accuracy on ImageNet-LT.

Questions about self-supervised learning on cifar10

Thanks for sharing the code! This work is really interesting to me. My questions are as follows:

I'm trying to reproduce the results in Table 2. Specifically, I trained models with and without self-supervised pre-training (SSP). However, the baselines (without SSP) consistently outperform those with SSP under different training rules (including None, Resample, and Reweight). The best precisions are presented below. For each experimental setting, I ran twice to check whether the results are stable, so there are two numbers per cell.

[results table attached as a screenshot]

For your reference, I used the following commands:

  • Train Rotation
python pretrain_rot.py --dataset cifar10  --imb_factor 0.01 --arch resnet32
  • Train baseline
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None 
  • Train baseline + SSP
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None --pretrained_model xxx 

About the Proof of Theorem1

At the end of the proof, the probability of event E is 1 - P1 - P2 - P3.
But why is it not the product of the three probabilities, (1 - P1)(1 - P2)(1 - P3)?

What is the intended learning rate schedule?

def adjust_learning_rate(optimizer, epoch, args):
    epoch = epoch + 1
    if epoch <= 5:
        lr = args.lr * epoch / 5
    elif epoch > 160:
        lr = args.lr * 0.01
    elif epoch > 180:
        lr = args.lr * 0.0001
    else:
        lr = args.lr
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

Hi, thanks for sharing your code!

I have a question about the code referenced above.
In the adjust_learning_rate function, lines 34 and 35 of the source file (the epoch > 180 branch) can never be reached, because any epoch greater than 180 is already caught by the epoch > 160 branch.
Could I ask which learning rate schedule you used for the experiments in the paper?

According to the 'adjust_learning_rate' function, the learning rate may change as follows.

epoch lr
0: args.lr * 1 / 5
1: args.lr * 2 / 5
2: args.lr * 3 / 5
3: args.lr * 4 / 5
4: args.lr * 5 / 5
5 ~ 160: args.lr
161~: args.lr * 0.01
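
For reference, a sketch of a schedule in which the epoch > 180 branch is actually reachable (a guess at the intent, not the authors' confirmed schedule) would check the later milestone first:

def adjust_learning_rate(optimizer, epoch, args):
    """Linear warm-up for 5 epochs, then decay by 100x after 160 and 10000x after 180."""
    epoch = epoch + 1
    if epoch <= 5:                   # linear warm-up
        lr = args.lr * epoch / 5
    elif epoch > 180:                # check the later milestone first
        lr = args.lr * 0.0001
    elif epoch > 160:
        lr = args.lr * 0.01
    else:
        lr = args.lr
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr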

How to get the image in the readme

Could you give some hints on how to generate the images stored in the assets directory? I want to produce that type of figure for my own dataset. Thanks.
[attached figure: tsne_self]

Can't achieve the given performance: ResNet-50 + SSP + CE(Uniform) on ImageNet-LT

I downloaded the pre-trained model from the given path (Resnet-50-rot) and trained it with the given config imagenet_inat/config/ImageNet_LT/feat_uniform.yaml.
The training command is:
python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml' --model_dir workdir/pretrain/moco_ckpt_0200.pth.tar
I only get 41.1 top-1 accuracy, but the given model achieves 45.6 [CE(Uniform) + SSP].

Can you help me figure out where the problem is?

Training on a custom dataset

Many thanks to the authors for this contribution. It would be even better if you could provide concrete steps for training on one's own dataset.

Have you ever tried "Semi-Supervised Imbalanced Learning on ImageNet-LT"?

Hi, have you ever tried "Semi-Supervised Imbalanced Learning" on ImageNet-LT?

According to the experimental results in the paper, semi-supervised imbalanced learning seems to perform better than self-supervised imbalanced learning on CIFAR-10-LT.

If I want to try this experiment, how should I modify dataset/imagenet.py into a dataset/imbalance_imagenet.py (similar to imbalance_cifar.py)?

About the method

Thank you for sharing your interesting work. Would you mind clarifying what the "CE(Balanced)" method is?

What's the required hardware to reproduce the result?

Thanks for sharing this code. It's interesting.
May I know the required hardware to reproduce the result?

The reason I'm asking is that I tried to run "pretrain_rot.py --dataset cifar10 --imb_factor 0.01", but the system does not respond for a long time when running "output = model(inputs)".
