Comments (4)

qyc-98 avatar qyc-98 commented on August 24, 2024

Hi, I fine-tuned it with a larger batch size (e.g., on 32 V100s or 64 A100s). You can try a larger gradient_accumulation_steps.
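For readers unfamiliar with the suggestion, here is a minimal sketch of why gradient accumulation can stand in for a larger batch. It is not PEVL's training code: the toy model y = w * x, the data points, and the helper names `grad_mse` and `accumulated_grad` are all assumptions made for illustration. It shows that averaging the gradients of equally sized micro-batches gives the same gradient as one large batch.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error of the toy model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equally sized micro-batches, as accumulation does."""
    chunks = [data[i:i + micro_batch_size]
              for i in range(0, len(data), micro_batch_size)]
    return sum(grad_mse(w, chunk) for chunk in chunks) / len(chunks)


# Hypothetical (x, y) pairs, chosen only for illustration.
data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]

full = grad_mse(0.5, data)                               # one batch of 4
accum = accumulated_grad(0.5, data, micro_batch_size=1)  # 4 micro-batches of 1

print(full, accum)
```

The equivalence holds for the gradient itself; layers that depend on batch statistics, such as batch normalization, can still behave differently with small micro-batches.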

from pevl.

huangsiyong avatar huangsiyong commented on August 24, 2024

OK, I will try it. Thanks.

huangsiyong avatar huangsiyong commented on August 24, 2024

Hi, this weekend I tried running on 16 2080 Ti GPUs with batch size = 1 and gradient_accumulation_steps = 32.
Accuracy is 66.7% after epoch 0 and 66.3% after epoch 1.
Since training takes so long and accuracy did not improve over the first two epochs, I want to know whether I should keep running it, or whether something is wrong.
Alternatively, can we freeze some of the model's parameters to maintain its performance and speed up fine-tuning?
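As a rough sanity check on the setup above, the effective batch size per optimizer step can be estimated as below. This is a sketch that assumes standard data-parallel training, where each GPU processes its own batch; the function name `effective_batch_size` is hypothetical and not part of the PEVL code.

```python
def effective_batch_size(num_gpus, per_gpu_batch, accumulation_steps):
    """Samples contributing to one optimizer step under data-parallel training."""
    return num_gpus * per_gpu_batch * accumulation_steps


# Values taken from the comment above: 16 GPUs, batch size 1, 32 accumulation steps.
print(effective_batch_size(16, 1, 32))  # 512
```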

huangsiyong avatar huangsiyong commented on August 24, 2024

I evaluated the two provided checkpoints, pevl_vcr_finetune.pth and pevl_vcr_ssp.pth, on the val set.
Their accuracies are 74.1/74.8 and 74.2/74.8, respectively.
Is this correct? There is no improvement between the two, and there is a gap compared with the results in the paper.
