The cspaf's intro from chenyulu2000

CS-PAF

Credits

This repository is built upon visdial_conv (Agarwal et al.). We sincerely thank the researchers for providing their code, which was instrumental in the development of this project.

Environment Configuration

conda create -n cspaf python=3.7
conda activate cspaf
pip install -r requirements.txt

python -c "import nltk; nltk.download('all')"

Data Preparation

Dataset	File	Source
Visdial v1.0	features_faster_rcnn_x101_train.h5	visdial-challenge-starter-pytorch (Das et al.)
	features_faster_rcnn_x101_val.h5
	features_faster_rcnn_x101_test.h5
	visdial_1.0_word_counts_train.json
	glove.npy	visdial-principles (Qi et al.)
	visdial_1.0_train.json	visdial official
	visdial_1.0_val.json
	visdial_1.0_test.json
	visdial_1.0_train_dense_annotations.json
	visdial_1.0_val_dense_annotations.json
VisdialConv	visdial_1.0_val_crowdsourced.json	subsets/visdialconv/ (Agarwal et al.)
VisdialConv	visdial_1.0_val_dense_annotations_crowdsourced.json	subsets/visdialconv/ (Agarwal et al.)
VisPro	visdial_1.0_val_vispro.json	subsets/vispro/ (Agarwal et al.)
VisPro	visdial_1.0_val_dense_annotations_vispro.json	subsets/vispro/ (Agarwal et al.)
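
Before training, it can help to confirm that every file listed in the table is in place. The following is a minimal sketch, not part of the repository: the file names come from the table above, while the single flat `data` directory and the helper name `find_missing` are assumptions made for illustration. Adjust the directory to wherever the repository's configuration expects the files.

```python
from pathlib import Path

# File names are taken from the table above. The flat "data" directory used
# below is an assumption for illustration, not the repository's documented layout.
EXPECTED_FILES = [
    "features_faster_rcnn_x101_train.h5",
    "features_faster_rcnn_x101_val.h5",
    "features_faster_rcnn_x101_test.h5",
    "visdial_1.0_word_counts_train.json",
    "glove.npy",
    "visdial_1.0_train.json",
    "visdial_1.0_val.json",
    "visdial_1.0_test.json",
    "visdial_1.0_train_dense_annotations.json",
    "visdial_1.0_val_dense_annotations.json",
    "visdial_1.0_val_crowdsourced.json",
    "visdial_1.0_val_dense_annotations_crowdsourced.json",
    "visdial_1.0_val_vispro.json",
    "visdial_1.0_val_dense_annotations_vispro.json",
]


def find_missing(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = find_missing("data")  # hypothetical location
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected files found.")
```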

Train or Finetune

bash -i scripts/cap_hist_early_fusion_disc_train.sh

We train the model on RTX 3090 GPUs with a batch size of 12 per GPU. With 2 GPUs, we use a learning rate of 5e-4. Training logs and checkpoints are saved in the directory exps/exp_name.
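
The settings above imply an effective batch size of 12 × 2 = 24. If you train with a different number of GPUs, one common rule of thumb is to scale the learning rate linearly with the effective batch size. The sketch below illustrates that arithmetic only; the linear scaling heuristic and the function names are assumptions, not something this repository prescribes, so treat the result as a starting point for tuning.

```python
def effective_batch_size(per_gpu_batch, num_gpus):
    """Total number of samples processed per optimizer step."""
    return per_gpu_batch * num_gpus


def scaled_learning_rate(base_lr, base_batch, new_batch):
    """Linear scaling heuristic (an assumption, not the authors' method):
    the learning rate changes in proportion to the effective batch size."""
    return base_lr * new_batch / base_batch


# Reference setup from the text: 12 samples per GPU, 2 GPUs, lr = 5e-4.
reference_batch = effective_batch_size(12, 2)

# Hypothetical single-GPU run with the same per-GPU batch size.
single_gpu_batch = effective_batch_size(12, 1)
single_gpu_lr = scaled_learning_rate(5e-4, reference_batch, single_gpu_batch)

print(reference_batch, single_gpu_batch, single_gpu_lr)
```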

Evaluate

bash -i scripts/cap_hist_early_fusion_disc_eval.sh

The training logs and checkpoints will be saved in the directory exps/exp_name. To obtain results from EvalAI, submit the file exps/exp_name/ranks.json.
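
Before uploading, a quick sanity check of ranks.json can catch malformed output. The sketch below assumes the usual VisDial challenge submission layout: a JSON list whose entries each hold `image_id`, `round_id`, and `ranks`, where `ranks` is a permutation of 1..100. That layout and the helper names are assumptions and are not confirmed by this repository, so verify them against the EvalAI challenge page.

```python
import json

NUM_OPTIONS = 100  # assumed number of candidate answers per question


def check_ranks(entries, num_options=NUM_OPTIONS):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    expected = list(range(1, num_options + 1))
    for i, entry in enumerate(entries):
        # Assumed required keys of one submission entry.
        for key in ("image_id", "round_id", "ranks"):
            if key not in entry:
                problems.append(f"entry {i}: missing key '{key}'")
        ranks = entry.get("ranks")
        if ranks is not None and sorted(ranks) != expected:
            problems.append(
                f"entry {i}: ranks are not a permutation of 1..{num_options}"
            )
    return problems


def check_ranks_file(path):
    """Load a ranks file and run check_ranks on its contents."""
    with open(path) as f:
        return check_ranks(json.load(f))
```

For example, `check_ranks_file("exps/exp_name/ranks.json")` returns an empty list when no problems are found under these assumptions.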

Attention Map Visualization (optional)

The repository Faster-R-CNN-with-model-pretrained-on-Visual-Genome can generate the 2048-d features. If you only want to quickly visualize results on Visdial v1.0, you can instead use our fork [https://github.com/chenyulu2000/Faster-R-CNN-with-model-pretrained-on-Visual-Genome]. The fork fixes some bugs and can generate h5 files for the Visdial v1.0 val set, which can be used directly for visual dialog visualization.

python attention_map_vis/extract_questions.py

python attention_map_vis/visualize.py
