alibaba-miil / imagenet21k
Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses" (NeurIPS, 2021) paper
License: MIT License
I'm currently working on a fine-grained object detection task; is it convenient to transfer this model to object detection?
Thanks
After pretraining on the 21k data with single labels, what is the fine-tuning strategy on the 1k data? E.g. MobilenetV3_large_100
Hi,
Thank you for your great work! I am currently trying to run your code with a larger batch size to better parallelize it. Could you provide some hints on how to adjust the learning rate for the larger batch size? I noticed that in the DeiT paper, the authors propose to scale the learning rate proportionally with the batch size; should I also do such scaling?
Thank you for your time!
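For reference, the linear scaling rule the DeiT paper follows (originally from Goyal et al., "Accurate, Large Minibatch SGD") can be sketched as below. Whether it transfers to this repo's Adam-based setup is an assumption on the questioner's part, not something the authors confirm here:

```python
def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling rule: grow the learning rate in proportion to batch size."""
    return base_lr * new_batch_size / base_batch_size

# e.g. the repo's default 3e-4 at batch size 1024, scaled up to batch size 4096:
scaled = scale_lr(3e-4, 1024, 4096)
```

In practice this rule is usually paired with a warmup phase, which the repo's training scripts already include.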
I use the vit_base_patch16_224_in21k model. When I train with the pretrained model, it reaches a high top-1 average, about 99%. But when I use the same (training) data to predict, the results are very bad. I don't know why; can you give me some ideas?
Hi,
great work, I am curious: could you please point out the precise definition of the mAP metric used to evaluate the experiments on the MSCOCO dataset in the paper?
Thanks!
Why do we need to load pretrained ImageNet-1K model if we are going to pretrain based on the imagenet21k dataset?
In order to verify the effectiveness of ImageNet-21k pretraining, I modified the Faster RCNN configuration file in mmdetection, only pointing the backbone ResNet50 pretraining weights to the 21k-trained model, but achieved worse results than with the torchvision pretrained weights.
@mrT23 Dear sir, when I use the pretrained ImageNet-21k model and set lr = 0.01 or 0.1 rather than the default value of 0.0003, the accuracy is very low (e.g. 0.17%), even after training for dozens of epochs.
If I use the SGD optimizer with lr = 0.01 or 0.1, the result on my own dataset is about 0.05% better than Adam with lr = 0.0003. How should I choose the best optimizer and the corresponding lr? Thank you very much!
After reading all the issues, I still don't understand how to build the tree...
I want to know how to get the multi-label information for each image. winter21_whole.tar.gz doesn't contain this information.
I'm very interested in this awesome work! Do you try your pretrained model on other tasks like COCO detection or cityscapes segmentation? The performance on the classification tasks is amazing. I wonder whether it can improve other tasks.
Could you provide the training parameters when using 8 * V100 with DDP?
When I use the following command line
python3 -u -m torch.distributed.launch --nnodes 1 --node_rank 0 --nproc_per_node 8 --master_port 2221 train_semantic_softmax.py --data_path /data/imagenet22kp_fall/ --model_name mobilenetv3_large_100 --epochs 80 --weight_decay 1e-4 --batch_size 1024 --lr 3e-4 --num_classes 11221 --tree_path ./data/imagenet21k_miil_tree.pth --model_path=./mobilenetv3_large_100.pth
The accuracy is only 71.366%, which is far lower than 73.1% reported in the paper.
Hi, thanks very much for sharing the code! I have read the code and found that the current Adam optimizer is not the AdamW (true weight decay) you mention in Appendix B.1 of your paper. Am I getting something wrong here?
"ImageNet-21K-P processed dataset, based on ImageNet-21K winter release, is now available for easy downloading via the offical ImageNet site."
No responses or permission granted since last year. Need help
Thank you for this awesome work! I want to make sure that you only rescale the input image to 0-1 instead of normalizing them by (img-mean)/std like we usually do in ImageNet training. Is that right?
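The two preprocessing conventions being contrasted can be sketched as follows; the mean/std values are the common torchvision ImageNet statistics, used here only for illustration and not taken from this repo:

```python
# Sketch: plain 0-1 rescaling vs. the usual (img - mean) / std normalization.

def rescale(pixel: int) -> float:
    """Map a uint8 pixel value into [0, 1]."""
    return pixel / 255.0

def normalize(value: float, mean: float = 0.485, std: float = 0.229) -> float:
    """Standard per-channel normalization, as typically done for ImageNet."""
    return (value - mean) / std
```

The question is whether the repo's pipeline stops after `rescale`, skipping `normalize` entirely.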
Hi, I have seen that you updated the single-label pretraining script on in21k. This is really great work. I have a question about pretraining ViT: I see the config for tresnet_m; do you have the configs for vit-b-16, or is it actually the same?
Cheers,
ImageNet21K/src_files/semantic/metrics.py
Line 41 in 7cb9e67
I got a casting error because of these two lines of code: num_valids_total is an int, and the in-place division
rt /= n
raises:
RuntimeError: result type Float can't be cast to the desired output type Long
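A minimal reproduction and a possible fix (a sketch, assuming the accumulator is a Long tensor as the error message suggests): in-place true division of an integer tensor fails, while out-of-place division promotes to float:

```python
import torch

# Integer accumulator, standing in for num_valids_total (assumed Long dtype).
rt = torch.tensor([4, 8], dtype=torch.long)
n = 2

# rt /= n  # raises: RuntimeError: result type Float can't be cast to ... Long

# Out-of-place division returns a new float tensor instead of writing in place:
rt = rt / n
# equivalently: rt = rt.float() / n
```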
Hi!
Thanks for the great repo!
I'm using timm to load your pretrained models.
The "_in21k" variants (e.g. tresnet_m_miil_in21k) output 11221 classes in the final layer,
which seems to be the number of classes in the Fall 11 version.
Where can I find the mapping from the model output to the class names/wordnet IDs?
Hi, in train_semantic_softmax.py, line 60 calls model = to_ddp(model, args) for the second time (the same call already appears on line 54). I think this line should be removed.
Hi,
Can you please share a link to the weights file of the model trained on the Stanford Cars dataset? I am unable to get the expected results using https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/ImageNet_21K_P/models/tresnet_l_v2_miil_21k.pth
Please advise.
Thanks,
In the vit paper, it says:
The classification head is implemented by a MLP with one hidden layer at pre-training
time and by a single linear layer at fine-tuning time
So if you are using timm package, they define the head like this:
https://github.com/rwightman/pytorch-image-models/blob/d3f744065088ca9b6b3a0f968c70e90ed37de75b/timm/models/vision_transformer.py#L293
Did you reach the stats in your paper using a single linear-layer head, or a head with one hidden layer?
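To make the two variants in question concrete, here is a sketch with assumed ViT-B/16 dimensions (not this repo's actual code):

```python
import torch
import torch.nn as nn

embed_dim, num_classes = 768, 1000  # assumed ViT-B/16 sizes, illustrative only

# Pre-training head as described in the ViT paper: MLP with one hidden layer.
pretrain_head = nn.Sequential(
    nn.Linear(embed_dim, embed_dim),
    nn.Tanh(),
    nn.Linear(embed_dim, num_classes),
)

# Fine-tuning head: a single linear layer, as in timm's vision_transformer.py.
finetune_head = nn.Linear(embed_dim, num_classes)

x = torch.randn(2, embed_dim)  # a dummy batch of [CLS] token embeddings
out_pre = pretrain_head(x)
out_fit = finetune_head(x)
```

Both heads map the embedding to class logits; they differ only in the extra hidden layer used at pre-training time.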
1 machine, 2 GPUs.
I can't use the command line to start.
How do I start DDP from within the code?
I can't find where the model is saved; how do I save it with DDP?
Hello, thanks for your great work.
I am confused about the different versions of the dataset. As far as I understand, we have the following versions:
- fall11_whole.tar, size 1.3T, 21841 classes
- ImageNet-21K-P based on the fall11 release, obtained via processing_script.sh, 11221 classes
- winter21_whole.tar.gz, size 1.1T, 19167 classes
- ImageNet-21K-P based on the winter21 release, possibly obtainable via processing_script.sh, 10450 classes
My question is: can I use processing_script.sh to process winter21_whole.tar.gz and get the so-called variant with 10450 classes?
As is evident, the SE module can dramatically improve the accuracy of networks such as ResNet. Could you please also provide some SE pretrained models,
like SE-ResNet101/SE-ResNet50... on this page?
https://github.com/Cadene/pretrained-models.pytorch#torchvision
Could you provide the ResNet50 model (pretrained on 21K) fine-tuned on the ILSVRC-2012 dataset?
Dear author:
Thank you for your awesome idea!
I have three questions:
1. If the label of a parent node does not appear in the original label set, how do you deal with it?
2. When I use WordNet hypernyms(), it sometimes returns multiple parents for one synset id; how do you deal with that?
3. How do you set the 0-11 levels? If you just use the number of ancestors, a node with two parents will obviously get the wrong level; besides, the labels may overlap across different levels. How do you get around this?
Looking forward to your reply
Your code is very nice. I want to ask: how do I get the labels and classes for the images/folders?
Hi @mrT23, thanks for your great work! Currently I use timm train.py to finetune the 'vit_base_patch16_224_miil_in21k' model on cifar100, however I can't get the reported result 94.2%.
Here is my running script.
python -m torch.distributed.launch --nproc_per_node=8 --master_port 6016 train.py \
/data/cifar-100-images/ \
-b=64 \
--img-size=224 \
--epochs=50 \
--color-jitter=0 \
--amp \
--lr=2e-4 \
--sched='cosine' \
--model-ema --model-ema-decay=0.995 --reprob=0.5 --smoothing=0.1 \
--min-lr=1e-8 --warmup-epochs=3 --train-interpolation=bilinear --aa=v0 \
--model=vit_base_patch16_224_miil_in21k \
--pretrained \
--num-classes=100 \
--opt=adamw --weight-decay=1e-4 \
--checkpoint-hist=1
I tried several settings:
adam, lr=2e-4, wd=1e-4 92.44%
adamw, lr=2e-4, wd=1e-4 92.90%
sgd, lr=2e-4, wd=1e-4 87.49%
adamw, lr=4e-4, wd=1e-4 92.24%
adamw, lr=2e-4, wd=1e-2 93.08%
Could you give me some suggestions?
Have the ImageNet-1K validation data and ImageNet-21K training data been de-duplicated?
The latest version timm can't find "vit_base_patch16_224_in21k_miil"
Hi, I'm impressed by your great work! I'm wondering would it be possible for you to release the model checkpoints after you trained from the pre-trained ImageNet-21k-P weights by your semantic KD on ImageNet-1k, which are the models in your Table6. I believe it would be more convincing if others could directly test your model on ImageNet-1k.
HotelID is a dataset that has similar hierarchical architecture as Imagenet21k.
https://arxiv.org/abs/2106.05746
So I want to try to use this project to train on that dataset.
The question is that in this case, each chain_id has many sub-hotels, which means the parent of the chain_id class is itself when generating the 'semantic tree'.
Do you think it can give reasonable results?
I want to speed up tresnet_m a step further. If I modify the bottleneck structure "[3,4,11,3]" to "[3,4,8,3]", the network still works. Did you compare the modified "[3,4,8,3]" or "[3,4,6,3]" to the standard ResNet50 "[3,4,6,3]" in terms of accuracy and speed? Thank you a lot.
ImageNet21K/visualize_detector.py
Line 25 in 72c822a
error:
Unknown model (vit_base_patch16_224_miil_in21k)
How can I run it with my own weights?
I tried to run visualize_detector.py, but there is an HTTP error:
urllib.request.urlretrieve(url, filename)
Exception has occurred: HTTPError
HTTP Error 403: Forbidden
How can this be solved?
Dear author :
When I test the ResNet50 model you provide on the ImageNet-21K-P val dataset, the semantic top-1 accuracy is just 69%, while you claim 75.6%. I do, however, see the improvement on downstream tasks. What could be the problem?
Hi @mrT23, I want to run inference on ImageNet-1K but don't know where to download the metadata. Could you please tell me where I can find the metadata for ImageNet-1K?
Thanks!
Hello,
Great work! Thanks for sharing. I was wondering if you can please share the training set accuracy and curves on ImageNet21K-P. In the appendix, I think you have reported the final validation set accuracy. It would be great if you can also share the final accuracy and training curves on the train set. It would help to debug any issue with other models.
Best,
Ankit
Hi, I try to transfer the ImageNet-21k model (resnet50_miil_21k) to ImageNet-1K according to the details provided in your paper and https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/Transfer_learning.md. But I only get 79.5% top-1 acc on ImageNet-1K, lower than the 82.0% in your readme. Could you give me some suggestions?
My training command is as follows:
python3 -m torch.distributed.launch --nproc_per_node=8 train.py
data/ImageNet-1k/
-b=128
--img-size=224
--epochs=100
--color-jitter=0
--amp
--sched='cosine'
--model-ema --model-ema-decay=0.995 --reprob=0.5 --smoothing=0.1
--min-lr=1e-8 --warmup-epochs=3 --train-interpolation=bilinear --aa=v0
--pretrained
--lr=2e-4
--model=resnet50
--opt=adam --weight-decay=1e-4
Hi,
Thank you for sharing your great work! I am currently running with the Winter 21 version of the ImageNet-21k-P dataset, but I cannot find the link to the semantic tree file winter21_imagenet21k_miil_tree.pth you mention on this page. Do you know where I can get this file?
Thank you for your time!
I can't find the semantic tree for the winter_21 version with 10450 classes anywhere.
I don't think there's any working link in this repo or on image-net.org.
Could you point me to the file?
Thank you
When I run training, a warning appears:
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in
the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first
value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
Since this code requires torch 1.7.1, I assume it can be affected by the warning. Should I reorder the calls, or just ignore the warning?
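The order PyTorch >= 1.1 expects can be sketched as follows (dummy parameter, loss, and schedule, purely illustrative and not this repo's training loop):

```python
import torch

# Dummy parameter, optimizer, and per-epoch LR schedule for illustration.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for epoch in range(2):
    optimizer.zero_grad()
    loss = (param ** 2).sum()
    loss.backward()
    optimizer.step()    # first: apply the gradient update
    scheduler.step()    # then: advance the schedule (no warning in this order)
```

Calling scheduler.step() before optimizer.step() triggers the warning and skips the first value of the schedule.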
Hello, I see that you already have RandAugment in your data preprocessing. Why do you additionally have cutoutPIL in the transform, given that RandAugment includes cutout? Does it give better performance?