mingkai-zheng / ressl

ReSSL: Relational Self-Supervised Learning with Weak Augmentation

self-supervised-learning unsupervised-learning machine-learning deep-learning

ressl's Introduction

ReSSL: Relational Self-Supervised Learning with Weak Augmentation (NeurIPS 2021)

This repository contains PyTorch evaluation code, training code and pretrained models for ReSSL.

For details, see "ReSSL: Relational Self-Supervised Learning with Weak Augmentation" by Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang and Chang Xu.

Cifar10 / STL10

This repository targets the ImageNet dataset. We also provide the training code and pretrained models for CIFAR-10/100, STL-10 and Tiny ImageNet; please download them from this link.

Reproducing

To run the code, you will probably need to change the dataset settings (dataset/imagenet.py) and the PyTorch DDP settings (util/dist_init.py) to match your own server environment.

The distributed training in this code is based on a Slurm environment. We provide the training scripts in the script folder.
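As a rough sketch of what adapting the distributed setup to a Slurm cluster can involve, the helper below reads the standard Slurm variables `SLURM_PROCID`, `SLURM_NTASKS` and `SLURM_LOCALID` and derives the values a DDP launcher typically needs. This is not the code in util/dist_init.py: the function name `slurm_dist_config`, the default port, and the fallback to a single local process are assumptions made for illustration.

```python
import os


def slurm_dist_config(env=None, default_port=29500):
    """Derive distributed-training settings from Slurm environment variables.

    Hypothetical helper: when the Slurm variables are absent (for example
    when the script is started without srun), it falls back to a single
    local process.
    """
    env = os.environ if env is None else env
    return {
        # Global index of this process across all nodes.
        "rank": int(env.get("SLURM_PROCID", 0)),
        # Total number of processes in the job.
        "world_size": int(env.get("SLURM_NTASKS", 1)),
        # Index of this process on its own node (usually the GPU to use).
        "local_rank": int(env.get("SLURM_LOCALID", 0)),
        # Rendezvous address and port; assumed to be exported by the job script.
        "master_addr": env.get("MASTER_ADDR", "127.0.0.1"),
        "master_port": int(env.get("MASTER_PORT", default_port)),
    }
```

The resulting values could then be passed to `torch.distributed.init_process_group` (as `rank`, `world_size` and a `tcp://addr:port` `init_method`) and to `torch.cuda.set_device`. On a server without Slurm, the same dictionary can be filled from whatever variables your launcher provides.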

We also provide pretrained ResNet50 models (single crop and 5 crops):

Method	Arch	BatchSize	Epochs	Crops	Linear Eval	Download
ReSSL	ResNet50	256	200	1	69.9%	ressl-200.pth
ReSSL	ResNet50	256	200	5	74.7%	ressl-multi-200.pth

If you want to test the pretrained model, please download the weights from the link above and move them to the checkpoints folder (create the folder if it does not exist). The evaluation script is also provided in script/train.sh.
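The folder step can be sketched in a few lines of Python. The helper name `checkpoint_path` is hypothetical, and the sketch assumes the checkpoints folder sits in the directory you run the scripts from; adjust `root` if your layout differs.

```python
from pathlib import Path


def checkpoint_path(filename, root="checkpoints"):
    """Return root/filename, creating the folder first if it is missing."""
    folder = Path(root)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / filename
```

For example, `checkpoint_path("ressl-200.pth")` gives the location where the downloaded single-crop weights would be placed.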

Citation

If you find ReSSL interesting and helpful for your research, please consider citing it:


@inproceedings{
      zheng2021ressl,
      title={Re{SSL}: Relational Self-Supervised Learning with Weak Augmentation},
      author={Mingkai Zheng and Shan You and Fei Wang and Chen Qian and Changshui Zhang and Xiaogang Wang and Chang Xu},
      booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
      year={2021},
      url={https://openreview.net/forum?id=ErivP29kYnx}
}

ressl's Issues

Momentum for imagenet

Hi!

In the paper I found no mention of the momentum parameter used when training on ImageNet.
However, I noticed that 0.999 is the default in your code.

Did you use a momentum of 0.999 on ImageNet?

Thanks.
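For context, and assuming the momentum here refers to the usual exponential-moving-average update of the target network, the coefficient controls how slowly the target tracks the online network. A minimal sketch with plain Python floats standing in for tensors (the name `momentum_update` is illustrative, not taken from the repository):

```python
def momentum_update(target, online, m=0.999):
    """Return EMA-updated target parameters.

    target, online: lists of floats standing in for network weights.
    m: momentum coefficient; 0.999 is the code default mentioned in the issue.
    """
    # Each target weight keeps a fraction m of its old value and takes
    # the remaining (1 - m) from the corresponding online weight.
    return [m * t + (1.0 - m) * o for t, o in zip(target, online)]
```

With m = 0.999, a single step moves each target weight only 0.1% of the way toward the online weight, so the target changes very slowly between iterations.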

Weight decay and Resnet18

Hi!

In my last issue, I forgot to congratulate you on your exceptional paper. I have been looking for a relational method since PAWS, but couldn't really find one that achieves such high performance on ImageNet. This method also works with a small batch size and very low computational resources, thanks to the frozen target network and single-view backprop. Nice work!

While reading the code, I noticed two minor differences from the paper. Can you please double-check these and clarify which version reflects the published results?

(1) Weight decay
- Paper: I didn't find any mention of weight decay for training on ImageNet, but found 5e-4 for the small and medium datasets.
- Code: link. You use a weight decay of 1e-4, and 0 for biases. Is this the default ImageNet setting?
(2) ResNet18 7x7 conv
- Paper:
- We adopt the ResNet18 [25] as our backbone network. Because most of our dataset contains low-resolution images, we replace the first 7x7 Conv of stride 2 with 3x3 Conv of stride 1 and remove the first max pooling operation for a small dataset.
- Code: link. I see no sign of these changes; it looks like you kept the original ImageNet ResNet setup. Is that right?

Thank you.

P.S.: I am trying to reproduce the results from the paper, but I am currently stuck at around 65% on ImageNet.
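To make point (1) concrete, here is a hedged sketch of how a "weight decay for weights, none for biases" split is commonly expressed as optimizer parameter groups. It is not the repository's code: `split_weight_decay` is a hypothetical name, and matching parameters by the suffix "bias" is an assumption about how biases are identified.

```python
def split_weight_decay(named_params, weight_decay=1e-4):
    """Build two parameter groups: decayed weights and undecayed biases.

    named_params: iterable of (name, parameter) pairs, for example the
    output of model.named_parameters() in PyTorch.
    """
    decay, no_decay = [], []
    for name, param in named_params:
        # Parameters whose name ends in "bias" are excluded from decay.
        if name.endswith("bias"):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]
```

A list of groups in this shape can be handed to a PyTorch optimizer such as `torch.optim.SGD` in place of `model.parameters()`, which applies the per-group `weight_decay` values.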

Code does not match the documented defaults "dim: feature dimension (default: 128)" and "K: queue size; number of negative keys (default: 65536)"

https://github.com/KyleZheng1997/ReSSL/blob/3e6644d6790b2fa1b68bf527d2fa15fbc7ad9412/network/ressl.py#L12

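Some questions about running the code

Hello, I only have one GPU. What command should I use to run your code? I noticed that there is a ressl_multi.py file and a ressl.py file. What is the difference between the two? Is one intended for a single GPU and the other for multiple GPUs?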

KNN evaluation

Hey,
Great paper!
Given the design of the loss and architecture, I would guess the learned representation should be good, or even SOTA, in KNN evaluation (as was also shown in the DINO paper).
I did not see such results on ImageNet in the paper. Did you try it? If so, can you share your results?
Thanks!
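For readers unfamiliar with the protocol, KNN evaluation classifies each test feature by looking up its nearest neighbours among labelled training features from the frozen encoder. The sketch below is a simplified, unweighted majority-vote version using cosine similarity in pure Python; the function names and the default `k` are illustrative assumptions, and published protocols often use weighted voting instead.

```python
import math
from collections import Counter


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def knn_predict(query, bank, labels, k=5):
    """Predict a label by majority vote over the k most similar bank features.

    query: feature vector of the test sample.
    bank: list of training feature vectors.
    labels: labels[i] is the class of bank[i].
    """
    q = _normalize(query)
    # Cosine similarity between the query and every feature in the bank.
    sims = [sum(a * b for a, b in zip(q, _normalize(f))) for f in bank]
    # Indices of the k most similar bank entries, most similar first.
    top = sorted(range(len(bank)), key=lambda i: sims[i], reverse=True)[:k]
    return Counter(labels[i] for i in top).most_common(1)[0][0]
```

Accuracy is then the fraction of test samples for which `knn_predict` returns the true label. No classifier is trained, so the score reflects the quality of the features directly.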

When will you release the code? Looking forward to it

Pretrain on CIFAR-10, CIFAR-100, STL-10, Tiny ImageNet

Thank you for your great work! I notice that results on small and medium datasets (i.e. CIFAR-10, CIFAR-100, STL-10, Tiny ImageNet) are provided in your paper. Can you provide pretraining configs on these datasets?

Information

In which journal was this paper published?

What is the final loss value?

Hi @KyleZheng1997,

Thanks for your contribution. Could you provide the pre-training log? I would really like to know the pre-training loss at the end of training.

CIFAR100 accuracy

After downloading your tiny-ressl implementation and evaluating your pretrained cifar100 network, I get 66.7%, while you reported 63.8% in the paper. When you have the time, can you please double-check which percentage is right? Thanks.

Implementation details about linear evaluation

Thanks for your great work and code sharing!
I downloaded the code and logs for the small datasets, but I can't reproduce the linear evaluation accuracy in Table 1 using the settings on page 6. Can you provide the detailed hyperparameter settings? If I'm not mistaken, linear evaluation runs for 100 epochs after 200 epochs of pretraining?

Pretrained models for CIFAR10, 100 and STL10

Hi @KyleZheng1997, thanks for the nice work and code. Would it be possible to provide the pretrained models for CIFAR-10, CIFAR-100 and STL-10? Thanks in advance.