imagenet21k's People

Contributors

florinandrei, mrt23

imagenet21k's Issues

finetune strategy on 1k data

After pretraining on the 21k data with single-label training, what is the fine-tuning strategy on the 1k data, e.g. for MobilenetV3_large_100?

How to adjust learning rate for larger batch size

Hi,

Thank you for your great work! I am currently trying to run your code with a larger batch size to parallelize training better. Could you provide some hints on how to adjust the learning rate for the larger batch size? I noticed that in the DeiT paper the authors scale the learning rate in proportion to the batch size. Should I apply the same scaling here?

Thank you for your time!
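
For reference, the linear scaling rule mentioned above can be sketched as follows. The reference batch size of 512 is the one used in the DeiT paper; whether this repository's training schedule should follow the same rule is not confirmed anywhere in this thread, and `scale_lr` is a hypothetical helper name.

```python
def scale_lr(base_lr: float, batch_size: int, reference_batch_size: int = 512) -> float:
    """Linear scaling rule: the learning rate grows in proportion to the batch size."""
    if batch_size <= 0 or reference_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * batch_size / reference_batch_size


if __name__ == "__main__":
    # Illustrative values only: a rate tuned at batch size 512, rescaled for 2048.
    print(scale_lr(5e-4, 2048))
```

Treat the scaled value as a starting point for a small sweep rather than a guaranteed optimum, since adaptive optimizers do not always follow the linear rule.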

mAP metric

Hi,
Great work! Could you please point out the precise definition of the mAP metric used to evaluate the MSCOCO experiments in the paper?

Thanks!
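
The paper's exact definition is not stated in this thread. As a point of reference only, one common definition for multi-label classification computes the average precision (AP) of each class from its ranked scores and then averages over classes. The sketch below follows that assumption; the function names are illustrative and the result may differ from the paper's evaluation code.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one class: mean of precision@k taken at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    # A class without positives contributes 0 here; other conventions skip it.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(scores, labels) -> float:
    """mAP: unweighted mean of per-class AP. Inputs are [num_samples][num_classes]."""
    num_classes = len(scores[0])
    per_class = [
        average_precision([row[c] for row in scores], [row[c] for row in labels])
        for c in range(num_classes)
    ]
    return sum(per_class) / num_classes
```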

About downstream tasks

To verify the effectiveness of ImageNet-21k pretraining, I modified the Faster RCNN configuration file in mmdetection so that the path of the resnet50 backbone's pretrained weights points to the 21k-trained model. This was the only change, but it achieved worse results than the torchvision pretrained weights.

The appropriate learning rate

@mrT23 Dear sir, when I use the pretrained imagenet21k model and set lr = 0.01 or 0.1 rather than the default lr value of 0.0003, the accuracy stays very low (e.g. 0.17%) until I have trained for dozens of epochs.
If I use the SGD optimizer with lr=0.01 or 0.1, the result on my own dataset is about 0.05% better than with Adam and lr=0.0003. How should I choose the best optimizer and the corresponding lr? Thank you very much!

Detection and segmentation performance

I'm very interested in this awesome work! Do you try your pretrained model on other tasks like COCO detection or cityscapes segmentation? The performance on the classification tasks is amazing. I wonder whether it can improve other tasks.

About Training Setting Parameters

Could you provide the training parameters for 8 * V100 with DDP?
When I use the following command line:
python3 -u -m torch.distributed.launch --nnodes 1 --node_rank 0 --nproc_per_node 8 --master_port 2221 train_semantic_softmax.py --data_path /data/imagenet22kp_fall/ --model_name mobilenetv3_large_100 --epochs 80 --weight_decay 1e-4 --batch_size 1024 --lr 3e-4 --num_classes 11221 --tree_path ./data/imagenet21k_miil_tree.pth --model_path=./mobilenetv3_large_100.pth
The accuracy is only 71.366%, which is far lower than the 73.1% reported in the paper.

the Adam Optimizer

Hi, thanks very much for sharing the code! I have read the code and found that the current Adam optimizer is not the AdamW (true weight decay) that you mention in Appendix B.1 of your paper. Am I missing something here?
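
The distinction raised here can be made concrete with a single-weight sketch. With Adam plus L2 regularization (what the `weight_decay` argument of `torch.optim.Adam` does), the decay term is added to the gradient and is therefore rescaled by the adaptive denominator. With decoupled decay (AdamW), the weight is shrunk directly. The function below is a toy written for illustration, not code from the repository.

```python
import math


def adam_step(w, grad, m, v, t, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8,
              weight_decay=1e-4, decoupled=False):
    """One Adam update for a scalar weight; returns (new_w, new_m, new_v)."""
    if not decoupled:
        # Adam + L2: the decay term joins the gradient and gets normalized below.
        grad = grad + weight_decay * w
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    if decoupled:
        # AdamW: shrink the weight directly, independent of gradient statistics.
        w = w - lr * weight_decay * w
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


if __name__ == "__main__":
    # With a zero loss gradient, only the decay term moves the weight.
    print("Adam + L2:", adam_step(1.0, 0.0, 0.0, 0.0, t=1)[0])
    print("AdamW:    ", adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled=True)[0])
```

In this toy case the L2 variant moves the weight by roughly `lr`, regardless of how small `weight_decay` is, while the decoupled variant moves it by `lr * weight_decay * w`.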

How did everyone get their ImageNet-21K?

"ImageNet-21K-P processed dataset, based on ImageNet-21K winter release, is now available for easy downloading via the offical ImageNet site."

I have received no response and no permission since last year. I need help.

Normalize

Thank you for this awesome work! I want to make sure that you only rescale the input image to the range 0-1 instead of normalizing it by (img-mean)/std, as is usually done in ImageNet training. Is that right?
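
The two preprocessing options being compared can be written out per pixel as below. The mean and standard deviation are the commonly used ImageNet channel statistics; which option the repository applies is exactly what this question asks, so the sketch takes no position on that.

```python
# Commonly used ImageNet channel statistics (RGB, for inputs scaled to 0-1).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def rescale(pixel):
    """Option A: map 8-bit RGB values to the range 0-1 and stop there."""
    return tuple(c / 255.0 for c in pixel)


def standardize(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Option B: rescale to 0-1, then apply (img - mean) / std per channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(pixel, mean, std))


if __name__ == "__main__":
    white = (255, 255, 255)
    print(rescale(white))
    print(standardize(white))
```

Whichever option was used during pretraining should be reused at fine-tuning and inference time, since a mismatch shifts the input distribution the model sees.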

How long does one epoch take?

I am pretraining from scratch on the processed winter version of imagenet21k with 8*V100. Three hours have passed and epoch 0 has not finished yet. It seems the training has crashed somehow. The figure shows the GPU stats from wandb. I guess this is not normal, right? What would you suggest changing?
[Screenshot, 2021-07-10 5:58:52 PM: GPU stats from wandb]

Single label pretraining on in21k using ViT

Hi, I have seen that you have updated single label pretraining script on in21k. This is really great work. I have some questions about pretraining ViT:

  1. The default setting is for tresnet_m. Do you have the configs for vit-b-16, or are they actually the same?
  2. What is the validation accuracy in single-label pretraining? In the table of your readme file, I see that with semantic loss ViT reaches 77.6%, and further fine-tuning on in1k reaches 84.4%. But what about single-label pretrained models?

Cheers,

ImageNet 21K-P Class ID:Name mapping

Hi!

Thanks for the great repo!

I'm using timm to load your pretrained models.
The "_in21k" variants (e.g. tresnet_m_miil_in21k) output 11221 classes in the final layer,
which seems to be the number of classes in the Fall 11 version.

Where can I find the mapping from the model output to the class names/wordnet IDs?

General fail in preprocessing

I ran the full preprocessing with your script on the winter21_whole.tar.gz dataset. Is it normal to get tons of "general fail" messages? Will these failed images affect the final training results?
[Screenshot, 2021-07-05 6:35:15 PM]

Line 60 in `train_semantic_softmax.py`

Hi, in train_semantic_softmax.py, line 60 calls model = to_ddp(model, args) for the second time (the same call is already made on line 54). I think this line should be removed.

processing for winter21-whole

Hello, thanks for your great work.
I am confused about the different versions of the dataset. As far as I understand, there are the following versions:

  • fall11_whole.tar, size 1.3T, 21841 classes

  • ImageNet-21K-P based on the fall11 release, which can be obtained using processing_script.sh, 11221 classes

  • winter21_whole.tar.gz, size 1.1T, 19167 classes

  • ImageNet-21K-P based on the winter21 release, which may or may not be obtainable using processing_script.sh, 10450 classes

My question is: can I use processing_script.sh to process winter21_whole.tar.gz and get the variant that has 10450 classes?

Some confusion about the processing to link wikidata

Dear author,
Thank you for your awesome idea!
I have three questions:
1. If the label of the parent node does not appear in the original label set, how do you deal with it?
2. When I use WordNet.hypernyms(), it sometimes returns multiple parents for one synset ID. How do you deal with that?
3. How do you set the levels 0~11? If you just use the number of a node's parents, a node with two parents will obviously get the wrong level. Besides that, the labels may overlap across different levels. How do you get around that?

Looking forward to your reply.
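
How the authors resolved these cases is not answered in this thread. As one possible convention, purely an assumption for illustration, a node with several hypernyms can be assigned the length of its longest path to a root, which keeps every node strictly deeper than all of its parents. Parents missing from the label set are treated as roots here. The toy `parents` mapping and the name `node_levels` are hypothetical and are not taken from the repository.

```python
from functools import lru_cache


def node_levels(parents):
    """Map each node to its level, defined as the longest path to a root.

    `parents` maps a node to a tuple of its direct parents (hypernyms).
    Nodes that appear only as parents are treated as roots at level 0.
    """
    @lru_cache(maxsize=None)
    def level(node):
        direct = parents.get(node, ())
        if not direct:
            return 0
        return 1 + max(level(p) for p in direct)

    all_nodes = set(parents)
    for direct in parents.values():
        all_nodes.update(direct)
    return {node: level(node) for node in all_nodes}


if __name__ == "__main__":
    # Toy hierarchy: "dog" has two parents at different depths.
    toy = {
        "carnivore": ("animal",),
        "domestic_animal": ("animal",),
        "canine": ("carnivore",),
        "dog": ("canine", "domestic_animal"),
    }
    for name, depth in sorted(node_levels(toy).items(), key=lambda kv: kv[1]):
        print(depth, name)
```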

The strategy of transferring ImageNet-21k ViT model to cifar100

Hi @mrT23, thanks for your great work! Currently I use timm train.py to finetune the 'vit_base_patch16_224_miil_in21k' model on cifar100, however I can't get the reported result 94.2%.
Here is my running script.

python -m torch.distributed.launch --nproc_per_node=8 --master_port 6016 train.py \
/data/cifar-100-images/ \
-b=64 \
--img-size=224 \
--epochs=50 \
--color-jitter=0 \
--amp \
--lr=2e-4 \
--sched='cosine' \
--model-ema --model-ema-decay=0.995 --reprob=0.5 --smoothing=0.1 \
--min-lr=1e-8 --warmup-epochs=3 --train-interpolation=bilinear --aa=v0 \
--model=vit_base_patch16_224_miil_in21k \
--pretrained \
--num-classes=100 \
--opt=adamw --weight-decay=1e-4 \
--checkpoint-hist=1

I tried several settings:
adam, lr=2e-4, wd=1e-4: 92.44%
adamw, lr=2e-4, wd=1e-4: 92.90%
sgd, lr=2e-4, wd=1e-4: 87.49%
adamw, lr=4e-4, wd=1e-4: 92.24%
adamw, lr=2e-4, wd=1e-2: 93.08%
Could you give me some suggestions?
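
For readers unfamiliar with the `--model-ema-decay=0.995` flag in the script above, the exponential moving average it refers to can be sketched as follows. This is a minimal illustration on plain lists of floats, not timm's implementation, and `ema_update` is a hypothetical name.

```python
def ema_update(ema_weights, model_weights, decay=0.995):
    """Blend the running average with the current weights: ema = d*ema + (1-d)*w."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, model_weights)]


if __name__ == "__main__":
    ema = [0.0]
    for _ in range(200):
        # Pretend the trained weight has settled at 1.0.
        ema = ema_update(ema, [1.0])
    print(ema[0])
```

With a decay of 0.995 the average has a time scale of roughly 1 / (1 - 0.995) = 200 updates, so on a short fine-tuning run the EMA weights can lag noticeably behind the trained weights.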

Data deduplication

Have the ImageNet-1K validation data and the ImageNet-21K training data been de-duplicated?

Request for ImageNet-1k KD results

Hi, I'm impressed by your great work! Would it be possible for you to release the model checkpoints obtained by starting from the pretrained ImageNet-21k-P weights and training on ImageNet-1k with your semantic KD, i.e. the models in your Table 6? I believe it would be more convincing if others could test your models directly on ImageNet-1k.

Train on custom dataset

HotelID is a dataset that has similar hierarchical architecture as Imagenet21k.
https://arxiv.org/abs/2106.05746
So I want to try to use this project to train on that dataset.

My question is this: in that dataset each chain_id has many sub-hotels, which means that the parent of a chain_id class is itself when generating the 'semantic tree'.

Do you think it can give reasonable results?

can't download metadata

I tried to run visualize_detector.py, but I got an HTTP error at this call:

urllib.request.urlretrieve(url, filename)

Exception has occurred: HTTPError
HTTP Error 403: Forbidden

How can this be solved?
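
One thing that is sometimes worth trying with a 403 from `urllib` is sending an explicit `User-Agent` header, because some servers reject the default Python one. This is only a guess at the cause: a 403 can equally mean that the file was moved or that access was withdrawn, in which case the sketch below will not help. The helper names and the user-agent string are illustrative.

```python
import shutil
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent="Mozilla/5.0"):
    """Stream the response body to `filename` (replacement for urlretrieve)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
```

Calling `download(url, filename)` with the same two arguments as the failing `urlretrieve` call is enough to test whether the header makes a difference.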

Val Result

Dear author :
When I test the resnet50 model you provide on the imagenet21k-p val dataset, the ImageNet-21K-P semantic top-1 accuracy is just 69%, whereas you report 75.6%. I do see the improvement on downstream tasks, though. What could be the problem?

ImageNet-1K Metadata

Hi @mrT23, I want to run inference on ImageNet-1K but do not know where I can download the metadata. Could you please tell me where I can find the metadata for ImageNet-1K?

Thanks!

Accuracy on ImageNet21K-P

Hello,

Great work! Thanks for sharing. I was wondering if you can please share the training set accuracy and curves on ImageNet21K-P. In the appendix, I think you have reported the final validation set accuracy. It would be great if you can also share the final accuracy and training curves on the train set. It would help to debug any issue with other models.

Best,
Ankit

The strategy of transferring ImageNet-21k models to ImageNet-1K.

Hi, I tried to transfer the ImageNet-21k model (resnet50_miil_21k) to ImageNet-1K according to the details provided in your paper and https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/Transfer_learning.md. But I only get 79.5% top-1 accuracy on ImageNet-1K, lower than the 82.0% in your readme. Could you give me some suggestions?

My training command is as follows:
python3 -m torch.distributed.launch --nproc_per_node=8 train.py \
data/ImageNet-1k/ \
-b=128 \
--img-size=224 \
--epochs=100 \
--color-jitter=0 \
--amp \
--sched='cosine' \
--model-ema --model-ema-decay=0.995 --reprob=0.5 --smoothing=0.1 \
--min-lr=1e-8 --warmup-epochs=3 --train-interpolation=bilinear --aa=v0 \
--pretrained \
--lr=2e-4 \
--model=resnet50 \
--opt=adam --weight-decay=1e-4
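
To make the schedule flags in this command easier to reason about, here is a simplified sketch of a cosine schedule with linear warmup, using the values from the command (100 epochs, lr 2e-4, min lr 1e-8, 3 warmup epochs). It is not timm's implementation: the warmup start value `warmup_start_lr` is an assumption, and timm's handling of warmup and per-step updates may differ.

```python
import math


def cosine_lr(epoch, total_epochs=100, base_lr=2e-4, min_lr=1e-8,
              warmup_epochs=3, warmup_start_lr=1e-6):
    """Learning rate at the start of `epoch` (0-based)."""
    if epoch < warmup_epochs:
        # Linear ramp from the assumed warmup start value up to base_lr.
        fraction = epoch / warmup_epochs
        return warmup_start_lr + fraction * (base_lr - warmup_start_lr)
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 1, 3, 50, 99):
        print(epoch, cosine_lr(epoch))
```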

Where can I download winter21_imagenet21k_miil_tree.pth

Hi,

Thank you for sharing your great work! I am currently working with the Winter 21 version of the ImageNet-21k-P dataset, but I cannot find the link to the semantic tree file winter21_imagenet21k_miil_tree.pth that you mention on this page. Do you know where I can get this file?

Thank you for your time!

Detected call of `lr_scheduler.step()` before `optimizer.step()`

When running training, a warning appears:

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in 
the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first 
value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

Since this code requires torch 1.7.1, I assume it is affected by this warning. Should I reorder the calls, or just ignore the warning?
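
The effect the warning describes can be illustrated with a toy model that is independent of PyTorch internals: a list of scheduled learning rates and a counter that the "scheduler step" advances. Everything below is a hypothetical illustration, not code from the repository.

```python
def learning_rates_used(schedule, num_updates, scheduler_first):
    """Return the learning rate applied at each weight update."""
    used = []
    index = 0  # position in the schedule, advanced by the "scheduler step"
    last = len(schedule) - 1
    for _ in range(num_updates):
        if scheduler_first:
            index += 1  # scheduler step happens before the weight update
        used.append(schedule[min(index, last)])  # the weight update itself
        if not scheduler_first:
            index += 1  # scheduler step happens after the weight update
    return used


if __name__ == "__main__":
    schedule = [0.1, 0.05, 0.025, 0.0125]
    print("optimizer first:", learning_rates_used(schedule, 3, scheduler_first=False))
    print("scheduler first:", learning_rates_used(schedule, 3, scheduler_first=True))
```

In the scheduler-first ordering the first scheduled value is never applied, which matches the wording of the warning. How much this matters in practice depends on how long the schedule is and how quickly it changes at the start.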

Data augmentations

Hello, I see that you already have randaugment in your data preprocessing. Why do you additionally have cutoutPIL in the transform, given that randaugment already includes cutout? Does it give better performance?
