Comments (4)
Hi, I fine-tuned it with a larger batch size (e.g., on 32 V100s or 64 A100s). You can try a larger gradient_accumulation_steps.
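For readers unfamiliar with the suggestion, here is a minimal sketch of why gradient accumulation can stand in for a larger batch. It is plain Python on a toy one-parameter model, not the pevl training code; the helper names `grad_mse` and `accumulated_grad` are hypothetical, and the equivalence shown assumes equal-sized micro-batches and a mean-reduced loss.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each scaled by 1/accum_steps.

    This mirrors the usual pattern of dividing the loss by the number of
    accumulation steps and only updating the weights after the last one.
    """
    assert len(xs) % micro_batch_size == 0, "assumes equal-sized micro-batches"
    accum_steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(0, len(xs), micro_batch_size):
        mb_x = xs[i:i + micro_batch_size]
        mb_y = ys[i:i + micro_batch_size]
        total += grad_mse(w, mb_x, mb_y) / accum_steps
    return total


# Toy data (illustrative only): y = 2x, starting from w = 0.5.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = grad_mse(w, xs, ys)                               # one batch of 4
accum = accumulated_grad(w, xs, ys, micro_batch_size=1)  # 4 steps of batch 1
print(full, accum)
```

Both values agree, which is why a small per-GPU batch with more accumulation steps can approximate a large-batch update, at the cost of more forward and backward passes per optimizer step.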
from pevl.
OK, I will try it. Thanks.
from pevl.
Hi, this weekend I tried training on 16 2080 Ti GPUs with batch size = 1 and gradient_accumulation_steps = 32.
Accuracy after epoch 0 is 66.7%, and after epoch 1 it is 66.3%.
Since training takes so long and accuracy has not increased after the first two epochs, I want to know whether I should keep running, or whether something is wrong.
Alternatively, can we freeze some of the model's parameters to maintain its performance and speed up fine-tuning?
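As a quick sanity check on the setup described above, the effective batch size per optimizer step can be worked out as below. This is a sketch that assumes standard data-parallel training, where each GPU processes its own micro-batch; the function name `effective_batch_size` is hypothetical and not part of the repository.

```python
def effective_batch_size(num_gpus, per_gpu_batch, accum_steps):
    """Samples contributing to one optimizer step under data parallelism."""
    return num_gpus * per_gpu_batch * accum_steps


# 16 GPUs, batch size 1 per GPU, 32 accumulation steps.
print(effective_batch_size(16, 1, 32))  # 512
```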
from pevl.
I evaluated the two provided checkpoints, pevl_vcr_finetune.pth and pevl_vcr_ssp.pth, on the val set.
Their accuracies are 74.1/74.8 and 74.2/74.8, respectively.
Is this expected? There is no improvement between them, and there is a gap compared with the result reported in the paper.
from pevl.