
One-Versus-Others Multimodal Attention

Description

We present One-Versus-Others (OvO), a new scalable multimodal attention mechanism. The proposed formulation significantly reduces computational complexity compared to the widely used early fusion through self-attention and cross-attention methods, as it scales linearly with the number of modalities rather than quadratically. OvO outperformed self-attention, cross-attention, and concatenation on four diverse medical datasets, including a four-modality, a five-modality, and two six-modality datasets. The figure below illustrates our model:
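As a rough illustration of the one-versus-others idea, the sketch below shows one plausible reading of the mechanism: each modality's embedding attends against a combination (here, the mean) of all the other modalities, so the number of attention computations grows linearly with the number of modalities instead of quadratically. The weight matrix `W` and the mean combination are illustrative assumptions, not the exact formulation from the paper.

```python
import numpy as np

def ovo_attention(modalities, W):
    """One-versus-others attention sketch: each modality attends to
    the mean of the remaining modalities (linear in n, not quadratic)."""
    scores = []
    for i, x_i in enumerate(modalities):
        # "others" context: mean of every modality except i
        others = np.mean([m for j, m in enumerate(modalities) if j != i], axis=0)
        scores.append(x_i @ W @ others)  # scalar relevance score for modality i
    weights = np.exp(scores) / np.sum(np.exp(scores))  # softmax over modalities
    # fused representation: attention-weighted sum of modality embeddings
    return sum(w * m for w, m in zip(weights, modalities))

# three hypothetical 4-dimensional modality embeddings
rng = np.random.default_rng(0)
mods = [rng.standard_normal(4) for _ in range(3)]
fused = ovo_attention(mods, np.eye(4))
print(fused.shape)  # (4,)
```

With n modalities this computes n scores, whereas self-attention over the concatenated modalities would compare every modality pair.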

Requirements

Python 3.9.0

PyTorch Version: 1.13.0+cu117

Torchvision Version: 0.14.0+cu117

To install requirements:

pip install -r requirements.txt

Preprocessing

This paper uses four medical datasets (MIMIC, TADPOLE, TCGA, eICU), two non-medical datasets (Hateful Memes and Amazon Reviews), and one simulation dataset. The preprocessing steps for each dataset are located in the README.md files in their respective folders. The common_files folder contains scripts that are used by multiple datasets.

Training and hyperparameter tuning

Training in this paper is done jointly with hyperparameter tuning using Weights and Biases (Wandb). The training and tuning scripts follow the pattern training_multimodal_hyper.py or training_unimodal_hyper.py. For example, to train a multimodal model using OvO attention on the MIMIC dataset, you would run the following command:

python3 mimic/training_multimodal_hyper.py OvO /path/to/data /path/to/save/model /path/to/config wandb_project_title

An example config file is provided in common_files/config.json, which includes the full grid we used to find the best hyperparameters. Note that the non-medical datasets (Hateful Memes and Amazon Reviews) use pre-trained models such as BERT and ResNet, and MIMIC uses ClinicalBERT, while the other medical datasets and the simulation dataset use standard neural network encoders. More details about exactly how to train each dataset are located in the README.md files inside each dataset folder.
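The exact keys in common_files/config.json are defined in the repo; as a hedged illustration, a grid like the one below can be enumerated with itertools.product to drive the tuning runs. The parameter names and values here are hypothetical examples, not the repo's actual schema:

```python
import itertools

# hypothetical hyperparameter grid in the spirit of common_files/config.json
grid = {
    "learning_rate": [1e-4, 1e-3],
    "batch_size": [16, 32],
    "num_attention_heads": [2, 4],
}

# expand the grid into one dict per candidate configuration
keys = list(grid)
configs = [dict(zip(keys, vals)) for vals in itertools.product(*grid.values())]
print(len(configs))  # 2 * 2 * 2 = 8 candidate configurations
```

Each resulting configuration would then correspond to one training run logged to the Wandb project named on the command line.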

Evaluation

The evaluation scripts follow the pattern evaluate.py, with a multimodal flag set to either True or False. For example, to evaluate a six-modality model using OvO attention on the TADPOLE dataset, you would run the following command:

python3 tadpole/evaluate.py True OvO learning_rate epochs batch_size random_seed_list /path/to/test_data number_of_attention_heads

More details about exactly how to evaluate each dataset are located in the README.md files inside each dataset folder, as the steps differ slightly across datasets.
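The random_seed_list argument suggests that results are reported over multiple seeds. A minimal sketch of that pattern is shown below; evaluate_once is a hypothetical stand-in for a dataset's actual evaluation routine:

```python
import statistics

def evaluate_once(seed):
    # hypothetical stand-in: run the trained model with this seed
    # and return a test metric such as accuracy
    return 0.80 + 0.01 * (seed % 3)

seeds = [1, 2, 3, 4, 5]
scores = [evaluate_once(s) for s in seeds]
mean = statistics.mean(scores)
stdev = statistics.stdev(scores)
print(f"accuracy: {mean:.3f} ± {stdev:.3f}")
```

Reporting a mean and standard deviation over seeds makes the comparison between OvO, self-attention, cross-attention, and concatenation less sensitive to any single initialization.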

