miccai19-medvqa's Introduction

!!! Check out our new paper and improved model for meta-learning in Medical Visual Question Answering.

Mixture of Enhanced Visual Features (MEVF)

This repository is the implementation of MEVF for the visual question answering task in the medical domain. Our model achieves 43.9% accuracy on open-ended questions and 75.1% on closed-ended questions on the VQA-RAD dataset. For details, please refer to the link.

This repository is based on and inspired by @Jin-Hwa Kim's work. We sincerely thank them for sharing their code.

[Figure] Overview of bilinear attention networks.

Prerequisites

Please install the dependency packages by running the following command:

pip install -r requirements.txt

Preprocessing

All data should be downloaded via the link. The downloaded file should be extracted into the data_RAD/ directory.
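
For convenience, a minimal extraction sketch in Python. The archive name data_RAD.zip below is a hypothetical placeholder, since the actual filename served by the link is not specified:

    import zipfile

    # Hypothetical archive name; substitute the actual file you downloaded.
    with zipfile.ZipFile("data_RAD.zip") as zf:
        zf.extractall(".")  # should leave the contents under data_RAD/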

Training

Train MEVF model with Stacked Attention Network

$ python3 main.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/SAN_MEVF

Train MEVF model with Bilinear Attention Network

$ python3 main.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/BAN_MEVF

The training scores will be printed every epoch.

                SAN+proposal   BAN+proposal
Open-ended          40.7           43.9
Close-ended         74.1           75.1

Pretrained models and Testing

In this repo, we include the pre-trained weights of MAML and CDAE, which are used to initialize the feature extraction modules.

The MAML model data_RAD/pretrained_maml.weights is trained using the official source code (link).

The CDAE model data_RAD/pretrained_ae.pth is trained with the code provided in train_cdae.py. To reproduce the pretrained model, please follow the instructions in that file.
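
For orientation, a minimal loading sketch in PyTorch. The import path and the class name Auto_Encoder_Model are assumptions about this repo's layout, so verify them against the source before use:

    import torch

    # Assumed module/class names; check the repository for the actual ones.
    from auto_encoder import Auto_Encoder_Model

    # CDAE checkpoint: a standard PyTorch state dict saved by train_cdae.py.
    ae_model = Auto_Encoder_Model()
    ae_model.load_state_dict(torch.load("data_RAD/pretrained_ae.pth"))
    ae_model.eval()

    # MAML checkpoint: trained with the official MAML code, so its on-disk
    # format may differ from a plain state dict (see the issues below).
    maml_state = torch.load("data_RAD/pretrained_maml.weights")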

We also provide the pretrained models reported as the best single models in the paper.

For the SAN_MEVF pretrained model: please download it from the link and move it to saved_models/SAN_MEVF/. The trained SAN_MEVF model can then be tested on the VQA-RAD test set via:

$ python3 test.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/SAN_MEVF --epoch 19 --output results/SAN_MEVF

For the BAN_MEVF pretrained model: please download it from the link and move it to saved_models/BAN_MEVF/. The trained BAN_MEVF model can then be tested on the VQA-RAD test set via:

$ python3 test.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/BAN_MEVF --epoch 19 --output results/BAN_MEVF

The resulting json file can be found in the results/ directory.

Citation

Please cite these papers in your publications if they help your research:

@inproceedings{aioz_mevf_miccai19,
  author={Binh D. Nguyen and Thanh-Toan Do and Binh X. Nguyen and Tuong Do and Erman Tjiputra and Quang D. Tran},
  title={Overcoming Data Limitation in Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2019}
}

If you find our meta-learning work for MedVQA useful, please consider citing the following paper:

@inproceedings{aioz_mmq_miccai21,
  author={Tuong Do and Binh X. Nguyen and Erman Tjiputra and Minh Tran and Quang D. Tran and Anh Nguyen},
  title={Multiple Meta-model Quantifying for Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2021}
}

License

MIT License

More information

AIOZ AI Homepage: https://ai.aioz.io

AIOZ Network: https://aioz.network

miccai19-medvqa's People

Contributors

gazeal, quangduytran, xuanbinh-nguyen96

miccai19-medvqa's Issues

Non-deterministic results in different runs

Thanks for publishing the code-base.

I noticed that I get different results across runs. This seems to come from PyTorch itself: setting torch.backends.cudnn.benchmark = False makes the runs deterministic, but the performance then drops by a few percentage points.
I was wondering whether you had noticed this, and whether the paper reports the average or the maximum over different runs. If so, how many times did you run the experiments?

Thanks in advance.
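
For reference, the usual PyTorch knobs for run-to-run determinism look like this (a general-purpose sketch, not code from this repository):

    import random

    import numpy as np
    import torch

    def seed_everything(seed: int = 42) -> None:
        # Seed every RNG the training loop may touch.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Force deterministic cuDNN kernels; this disables the autotuner,
        # which is why throughput (and sometimes accuracy) can change.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False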

code for generating the files in data_RAD

Hello, I am interested in VQA in the medical field, and your work is amazing! However, I am confused about how to generate the files in the data_RAD directory, such as trainset.json, testset.json, images84x84, and images128x128, from the original data. Could you please share the code for this? Thank you!

not able to understand MAML in code files

As mentioned in the paper, MAML considers 5 tasks per iteration; each task contains 3 classes out of the 9 subcategories and 6 images (3 for training and 3 for validation). However, I cannot find this part in the code: the validation set does not appear, as only training is carried out, and the meta-update and training steps are hard to follow. Could you please point out which file implements these parts and give a high-level briefing of the MAML (meta-learning) code sections?
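
For reference, the episode construction described in the paper (5 tasks per iteration, each with 3 of the 9 classes and 3 train plus 3 validation images) can be sketched roughly like this; it is an illustrative reconstruction, not the repository's actual code:

    import random

    def sample_maml_tasks(images_by_class, n_tasks=5, n_way=3, k_shot=1):
        # images_by_class: dict mapping each of the 9 subcategories to
        # a list of image identifiers.
        tasks = []
        for _ in range(n_tasks):
            chosen = random.sample(list(images_by_class), n_way)
            support, query = [], []
            for cls in chosen:
                imgs = random.sample(images_by_class[cls], 2 * k_shot)
                support += [(img, cls) for img in imgs[:k_shot]]  # 3 train images
                query += [(img, cls) for img in imgs[k_shot:]]    # 3 val images
            tasks.append((support, query))
        return tasks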

problem

Hello, excuse me. I reproduced the results with the provided code and found the accuracy slightly worse than in the paper. I then evaluated with the provided weight file and found the accuracy was even worse. Could this be caused by using a different dataset? Did you fine-tune on the dataset in some way? Thank you.

FileNotFoundError: [Errno 2] No such file or directory: 'data_RAD/imgid2idx.json'

I downloaded the dataset from the link and split the json file into trainset.json and test.json, but I still cannot find imgid2idx.json. When I run

python3 main.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/SAN_MEVF

I get:

FileNotFoundError: [Errno 2] No such file or directory: 'data_RAD/imgid2idx.json'

Could you please show me where I can get imgid2idx.json?
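
If regeneration is unavoidable, one plausible (unconfirmed) reconstruction is a simple image-name-to-index map built from the split files; the image_name key and the sequential-index format are guesses, so verify them against the repository's dataloader:

    import json

    # Hypothetical reconstruction of imgid2idx.json: assign each unique
    # image a sequential index. The actual format may differ.
    names = []
    for split in ("data_RAD/trainset.json", "data_RAD/testset.json"):
        with open(split) as f:
            for entry in json.load(f):
                if entry["image_name"] not in names:
                    names.append(entry["image_name"])

    with open("data_RAD/imgid2idx.json", "w") as f:
        json.dump({name: idx for idx, name in enumerate(names)}, f)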

Request for the split data for training MAML

Hello,
The paper describes that all the images are categorized into 9 classes: head normal, head abnormal present, head abnormal organ, chest normal, chest abnormal organ, chest abnormal present, abdominal normal, abdominal abnormal organ, and abdominal abnormal present. Could you please provide the split data?
Thanks!

How to get pretrained MAML weights

Hello,
Thank you for your kind sharing. I'm trying to reproduce this paper but am having some trouble.
In base_model.py line 164, the code maml_v_emb.load_state_dict(torch.load(weight_path)) loads pretrained weights for MAML, but I can't find the corresponding .pth file.
I tried to load pretrained_maml.weights, but the weights in this file are inconsistent with the structure of the MAML model.
Here are my questions:

  • How can I obtain the .pth file to initialize MAML correctly, or am I expected to train it from scratch?
  • Since pretrained_maml.weights doesn't seem to be used to initialize MAML, what is it used for?

Thanks a lot.

Get the model output based on a given image

Hi, thanks for providing the autoencoder model and the checkpoints of the two models. I'm wondering whether there is a quick way to encode a given new image with the pretrained CDAE, feed it into the BAN model, and get results on the original question set. I just want to have a play with this :)
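
A rough sketch of the encoding step. The import path, the forward_pass method name, and the 128x128 grayscale preprocessing are all assumptions inferred from the files in data_RAD/, not a confirmed API:

    import numpy as np
    import torch
    from PIL import Image

    # Assumed module/class names; check the repository for the actual ones.
    from auto_encoder import Auto_Encoder_Model

    ae_model = Auto_Encoder_Model()
    ae_model.load_state_dict(torch.load("data_RAD/pretrained_ae.pth"))
    ae_model.eval()

    # Assumed preprocessing: 128x128 grayscale, matching images128x128.
    img = Image.open("my_scan.png").convert("L").resize((128, 128))
    x = torch.from_numpy(np.asarray(img, dtype=np.float32) / 255.0)
    x = x[None, None]  # shape (1, 1, 128, 128)

    with torch.no_grad():
        # forward_pass is an assumed name for the encoder half of the CDAE;
        # its output would then be fed to the BAN model with the question.
        feat = ae_model.forward_pass(x)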

The accuracy of the model

Dear author,
I followed the training steps strictly and used the data_RAD/ files from the link you offered, but I can only get around 59% accuracy when testing the BAN_MEVF model.
I could reproduce the 62% accuracy using the pretrained BAN_MEVF model.
I wonder if this is because the pretrained AE or MAML model in data_RAD/ is not the best one?
