Comments (10)
My server has 4 × GTX 1070 (8 GB each) + 64 GB RAM + 2 × Intel Xeon E5-2620 v4 (2.1 GHz) + 1 TB PCIe SSD + 1 TB SATA SSD.
However, to train one model with Attention with a small data loading time, you will need one PyTorch-compatible GPU, 3 CPU threads, and one 500 GB SATA SSD devoted to storing data and only data (WARNING: not the OS).
With a Pascal GPU on VQA 1.0 (VQA 2.0 will be added soon; it has twice the number of question/answer pairs but the same images):
- on train/val:
  - with no att model: 5 hours
  - with att model: 10 hours
- on trainval/testdev:
  - with no att model: 10 hours
  - with att model: 20 hours
- on trainval+visual_genome/testdev (visual genome will be added soon):
  - with no att model: 1 day
  - with att model: 2 days
from vqa.pytorch.
@ahmedmagdiosman I tried to load data from the SSD I use as my boot drive, and got high data loading times when training models with Attention (for instance, the OS may sometimes write to it or do other blocking work). In fact, you need to load data of dim (batch_size x 2048 x 14 x 14), which is really big. The models without Attention (NoAtt), however, only need to load data of dim (batch_size x 2048), so the boot drive is ok for those :)
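To make the size gap concrete, here is a quick back-of-the-envelope calculation; the batch size of 128 and float32 storage are assumptions for illustration, not values from the repo:

```python
# Bytes moved per batch for attention features (2048 x 14 x 14)
# vs no-attention features (2048), assuming float32 (4 bytes)
# and a hypothetical batch size of 128.
batch_size = 128
bytes_per_float = 4

att_batch = batch_size * 2048 * 14 * 14 * bytes_per_float
noatt_batch = batch_size * 2048 * bytes_per_float

print(f"att:   {att_batch / 2**20:.0f} MiB per batch")    # 196 MiB
print(f"noatt: {noatt_batch / 2**20:.2f} MiB per batch")  # 1.00 MiB
print(f"ratio: {att_batch // noatt_batch}x")              # 196x
```

At 196 MiB per batch, a SATA disk shared with the OS quickly becomes the bottleneck, which matches the slow loading described above.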
An SSD is not required to run the models with Attention, but if you have high data loading times, be sure to use monitoring tools such as atop or htop to locate the bottleneck.
Lastly, I suspect that h5py/HDF5 is not well suited for this kind of read-intensive task. In fact, my old Torch7 code with torchnet.IndexedDataset seemed to work better. If I had the time, I would compare h5py/HDF5 against LMDB.
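Such a comparison could be sketched with a small stdlib-only timing harness; the `bench_random_reads` helper below is hypothetical, and the `read_fn` argument is a placeholder you would swap for an h5py dataset lookup or an LMDB read:

```python
import time
import random

def bench_random_reads(read_fn, n_items, n_reads=1000, seed=0):
    """Time n_reads random-index reads through read_fn(i) and return
    the mean latency in seconds. read_fn stands in for whatever backend
    is being compared (h5py dataset, LMDB transaction, ...)."""
    rng = random.Random(seed)
    t0 = time.perf_counter()
    for _ in range(n_reads):
        read_fn(rng.randrange(n_items))
    return (time.perf_counter() - t0) / n_reads

# Trivial in-memory "backend" just to show the call shape:
data = list(range(10000))
mean_latency = bench_random_reads(lambda i: data[i], len(data))
```

Random-index access is the pattern a shuffled DataLoader produces, which is exactly where chunked HDF5 reads tend to suffer.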
from vqa.pytorch.
In my experience, the best approach is to use a pretrained Caffe model (e.g. via the MCB code) and store the tensors as compressed numpy arrays. In that case the whole train set only takes 19 GB, so you can cache it in RAM.
For torch example see github.com/ilija139/vqa-soft
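A minimal sketch of the compressed-numpy-array approach described above, assuming float32 attention features of shape 2048 x 14 x 14; the file path and random data are purely illustrative:

```python
import os
import tempfile
import numpy as np

# One image's features (2048 x 14 x 14, float32); random here for the sketch.
feat = np.random.rand(2048, 14, 14).astype(np.float32)

# Save as a compressed .npz file; the path is illustrative.
path = os.path.join(tempfile.gettempdir(), "coco_feat_example.npz")
np.savez_compressed(path, feat=feat)

# At training time, load the array back (and keep it cached in RAM).
loaded = np.load(path)["feat"]
```

Real ResNet features compress much better than random data, which is presumably how the whole train set fits in 19 GB.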
from vqa.pytorch.
The VQA dataset is the most widely used dataset in the visual question answering community, as it has the largest volume of human-annotated open-ended answers. Other datasets such as DAQUAR, COCO-QA or Visual7W are limited in terms of size and annotation quality. These limitations make them less relevant than the VQA dataset for evaluating multimodal fusion models, so we do not provide implementations for them (feel free to contact us if you need those datasets).
VQA 1.0 consists of several splits: train, val, and test-std (which includes test-dev).
The biggest models are trained on train + val as the training set, with the test-dev split used for validation (on the evaluation server).
Thus, for study purposes, the smallest setup provided in this repo uses the train split as the trainset and the val split as the valset.
You can train/eval a model on this split using the trainsplit: train option. See mutan_noatt_train.yaml.
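For illustration, the relevant part of such an options file might look like this; only the trainsplit: train key comes from this thread, and the surrounding structure is an assumption:

```yaml
# Hypothetical excerpt of an options file such as mutan_noatt_train.yaml;
# only `trainsplit: train` is taken from this thread.
vqa:
  trainsplit: train  # train on the train split, validate on val
```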
from vqa.pytorch.
Would you mind sharing your hardware configuration (memory/GPU, etc.) and how much time training takes? Thanks.
from vqa.pytorch.
I guess the training time excludes the process of generating the ResNet-152 features.
from vqa.pytorch.
Generating the train features takes 30 minutes.
By the way, the features used in our paper are available at https://github.com/Cadene/vqa.pytorch#features
from vqa.pytorch.
I was under the impression that VQA tasks always take weeks of training.
OK, I will try it later.
from vqa.pytorch.
Hey,
why is the SSD required to store only data?
Thanks for providing your code!
from vqa.pytorch.
@Cadene Thanks a lot!
I actually had some really slow loading times with the Torch7 code from the MLB paper; I suspect it's also the HDF5 format, since I didn't have problems loading *.npy files in pycaffe with the original MCB code.
from vqa.pytorch.