Comments (13)

Dongshengjiang commented on July 27, 2024

Have you tried the linear probing evaluation?

from mae-pytorch.

pengzhiliang commented on July 27, 2024

Emm, how about end-to-end finetuning?

Dongshengjiang commented on July 27, 2024

I just tried your latest update of end-to-end fine-tuning, and it seems good. But I think linear probing is still a metric that cannot be avoided.

pengzhiliang commented on July 27, 2024

Thanks for your suggestion. We have actually ignored the linear probing metric so far. In fact, I am not very familiar with linear probing. Could you help me implement it? Thank you very much!

Dongshengjiang commented on July 27, 2024

https://github.com/facebookresearch/dino/blob/main/eval_linear.py
DINO contains code for both k-NN and linear evaluation. I am not sure how to treat the cls token: linear probing only trains the final head, but in MAE the cls token is not pre-trained.

pengzhiliang commented on July 27, 2024

Ok, thank you~

pengzhiliang commented on July 27, 2024

Hello, have you finished the end-to-end fine-tuning of vit-base/1600e? Could you tell me the result? Thank you!

Dongshengjiang commented on July 27, 2024

Hi, I finished the 1600-epoch training, but I only got a fine-tuning result of 83.15 for epoch 1400 and 82.97 for epoch 1600, which is lower than both your reported epoch-400 result and the paper's results.

Dongshengjiang commented on July 27, 2024

From your vit_base pre-training log, I found that your maximum learning rate is 0.0024. Did you run with a batch size of 128 x 32?
According to the code, args.lr = args.lr * total_batch_size / 256, which should give 0.0006 for a batch size of 128 x 8.

pengzhiliang commented on July 27, 2024

OK, that is very strange. I ran vit-base with a total batch size of 512 x 8 = 4096, which gives lr = 1.5e-4 * 512 * 8 / 256 = 0.0024.
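
The two learning rates discussed above follow from the same linear scaling rule quoted from the code (args.lr = args.lr * total_batch_size / 256). As a minimal sketch, assuming a base learning rate of 1.5e-4 as stated in the thread, the arithmetic can be checked as follows. The function name `scaled_lr` and its parameter names are hypothetical and are not taken from the repository.

```python
def scaled_lr(base_lr, per_gpu_batch_size, num_gpus, reference_batch_size=256):
    """Linear scaling rule: lr = base_lr * total_batch_size / reference_batch_size.

    Hypothetical helper for illustration; it mirrors the formula quoted in the
    thread, not the repository's actual function.
    """
    total_batch_size = per_gpu_batch_size * num_gpus
    return base_lr * total_batch_size / reference_batch_size


if __name__ == "__main__":
    # 512 x 8 = 4096 total batch size, the setting reported by pengzhiliang.
    print(f"512 x 8: {scaled_lr(1.5e-4, 512, 8):.4f}")
    # 128 x 8 = 1024 total batch size, the setting Dongshengjiang expected.
    print(f"128 x 8: {scaled_lr(1.5e-4, 128, 8):.4f}")
```

With these inputs the rule gives 0.0024 for a total batch size of 4096 and 0.0006 for 1024, so the 0.0024 in the log is consistent with either 512 x 8 or 128 x 32.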

Dongshengjiang commented on July 27, 2024

OK, I will try your settings to reproduce your epoch-400 result. But the epoch-1600 results were obtained with batch size 4096 and are still not good enough. The fine-tuning accuracy increases only slowly with the epoch: 82.71/200, 82.82/400, 82.87/600, 83/800, 82.78/1000, 82.96/1200, 83.15/1400, 82.97/1600.

pengzhiliang commented on July 27, 2024

OK, thank you for running so many experiments!
Maybe there are still some problems; I will check carefully.

Harick1 commented on July 27, 2024

@Dongshengjiang Have you tried the linear probing evaluation with the cls token?

The paper says: "As ViT has a class token [16], to adapt to this design, in our MAE pre-training we append an auxiliary dummy token to the encoder input. This token will be treated as the class token for training the classifier in linear probing and fine-tuning."

It seems that the authors just add a dummy token during pre-training and directly use it as the feature for linear probing.
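
The idea described above (freeze the encoder, take the output at the class-token position, and train only a linear head on it) can be sketched roughly as follows. This is an illustrative sketch, not the code of mae-pytorch, DINO, or the paper: `ToyEncoder`, `build_linear_probe`, and `probe_logits` are hypothetical names, the encoder is a tiny stand-in for a pre-trained ViT, and all sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a pre-trained ViT encoder.

    Prepends a learnable class (dummy) token and returns tokens of shape
    (batch, 1 + num_patches, embed_dim).
    """

    def __init__(self, patch_dim=48, embed_dim=32):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.block = nn.TransformerEncoderLayer(
            embed_dim, nhead=4, dim_feedforward=64, batch_first=True
        )

    def forward(self, patches):
        x = self.embed(patches)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)  # class token sits at index 0
        return self.block(x)


def build_linear_probe(encoder, embed_dim, num_classes):
    """Freeze every encoder weight and return a fresh trainable linear head."""
    for param in encoder.parameters():
        param.requires_grad = False
    encoder.eval()
    return nn.Linear(embed_dim, num_classes)


def probe_logits(encoder, head, patches):
    """Classify from the class-token feature only; no gradient reaches the encoder."""
    with torch.no_grad():
        tokens = encoder(patches)
    cls_feature = tokens[:, 0]
    return head(cls_feature)


if __name__ == "__main__":
    encoder = ToyEncoder()
    head = build_linear_probe(encoder, embed_dim=32, num_classes=10)
    logits = probe_logits(encoder, head, torch.randn(4, 16, 48))
    print(logits.shape)
```

Only the parameters of `head` would be passed to the optimizer in such a setup, which is what distinguishes linear probing from end-to-end fine-tuning.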
