HighMMT

HighMMT is a general-purpose model for high-modality (large number of modalities beyond the prototypical language, visual, and acoustic modalities) and partially-observable (across many tasks, where each task is defined only over a small subset of all modalities we are interested in modeling) scenarios.

HighMMT uses multitask learning with shared unimodal and multimodal layers to enable stable parameter counts (addressing scalability) and cross-modal transfer learning to enable information sharing across modalities and tasks (addressing partial observability).
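As a rough illustration of why sharing layers keeps the parameter count stable, the sketch below compares two hypothetical setups: one full encoder per modality versus one shared encoder plus a small modality-specific embedding. The function names and all sizes are made-up assumptions for illustration and are not HighMMT's actual configuration.

```python
# Toy arithmetic only: the sizes used here are invented, not taken from HighMMT.

def separate_params(num_modalities, encoder_params):
    """One full encoder per modality: cost grows linearly with the number of modalities."""
    return num_modalities * encoder_params


def shared_params(num_modalities, encoder_params, embedding_params):
    """One shared encoder plus a small modality-specific embedding per modality."""
    return encoder_params + num_modalities * embedding_params


# Print both counts side by side for a growing number of modalities.
for m in (2, 5, 10):
    print(m, separate_params(m, 1_000_000), shared_params(m, 1_000_000, 1_000))
```

With these toy numbers, the shared setup grows by only the embedding size for each added modality, while the separate setup grows by a full encoder.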

The same HighMMT model (architecture and parameters) is able to simultaneously encode joint representations between different subsets spanning images, text, audio, sets, time-series, and graphs.

Paper

High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning
Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Ruslan Salakhutdinov
TMLR 2022.

If you find this repository useful, please cite our paper:

@article{liang2022high,
  title={High-Modality Multimodal Transformer: Quantifying Modality \& Interaction Heterogeneity for High-Modality Representation Learning},
  author={Liang, Paul Pu and Lyu, Yiwei and Fan, Xiang and Tsaw, Jeffrey and Liu, Yudong and Mo, Shentong and Yogatama, Dani and Morency, Louis-Philippe and Salakhutdinov, Russ},
  journal={Transactions on Machine Learning Research},
  year={2022}
}

Contributors

Correspondence to:

Usage

Environment Setup Using Conda

conda env create -f env_HighMMT.yml

Quick Start

Instructions for running the code and retrieving the data are printed by running

./run.sh help

You can also find detailed instructions below.

Data Download

Three datasets (robotics, enrico, and RTFM) can be set up directly with the script ./download_datasets.sh. Run

./download_datasets.sh help

for instructions. To set up each dataset, run ./download_datasets.sh followed by the dataset name. For example,

./download_datasets.sh robotics

downloads the robotics dataset to the directory datasets/robotics.

This repo is built on top of the MultiBench repository, so to download the datasets, follow the same instructions as https://github.com/pliang279/MultiBench.git
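Before starting a training run, it can help to confirm that the download step actually populated the target directory. The following is a minimal sketch, assuming the datasets/NAME layout mentioned above; the helper name dataset_ready is hypothetical and is not part of this repository.

```python
from pathlib import Path


def dataset_ready(root, name):
    """Return True if <root>/datasets/<name> exists and contains at least one entry."""
    target = Path(root) / "datasets" / name
    return target.is_dir() and any(target.iterdir())


# Example: run from the repository root after ./download_datasets.sh robotics
print(dataset_ready(".", "robotics"))
```

This only checks that the directory is non-empty; it does not validate the contents of the downloaded files.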

Easy setting experiment code

From the root of this repo, run

python private_test_scripts/perceivers/roboticstasks.py model.pt

The model will be saved to model.pt.

Medium setting experiment code

To run medium tasks, please run

python private_test_scripts/perceivers/medium_tasks.py

Hard setting experiment code

To run multitask training on 1/2/3/4 tasks, please run

python private_test_scripts/perceivers/singletask.py
python private_test_scripts/perceivers/twomultitask.py
python private_test_scripts/perceivers/threemultitask.py
python private_test_scripts/perceivers/fourmultitask.py
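To launch several of these runs one after another, a small wrapper can chain the commands. This is a sketch under the assumption that each script runs without extra arguments, as in the commands above; the names SCRIPTS, build_command, and run_all are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

# Script names as listed in this README, keyed by the number of tasks.
SCRIPTS = {
    1: "singletask.py",
    2: "twomultitask.py",
    3: "threemultitask.py",
    4: "fourmultitask.py",
}


def build_command(num_tasks):
    """Build the command line for the script that trains on `num_tasks` tasks."""
    script = SCRIPTS[num_tasks]
    return [sys.executable, "private_test_scripts/perceivers/" + script]


def run_all(task_counts=(1, 2, 3, 4)):
    """Run the selected scripts sequentially; stop at the first failure."""
    for n in task_counts:
        subprocess.run(build_command(n), check=True)

# Call run_all() from the root of this repo to start the runs.
```

Because check=True is set, a failing script raises subprocess.CalledProcessError and the remaining runs are skipped.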

Parameter Sharing Experiments

To run the parameter sharing experiments, please run

python private_test_scripts/perceivers/shared_fourmulti.py

A baseline can be trained as a starting point for fine-tuning by running fourmultitask.py as described above. You can specify the baseline in shared_fourmulti.py.

Parameter groupings can also be specified in the shared_fourmulti.py file.

Heterogeneity Matrix

To get the heterogeneity matrix between individual modalities and pairs of modalities, please run

python private_test_scripts/perceivers/tasksim.py


highmmt's Issues

Nonexistent pytorch_perciever.

Hello! When I run python private_test_scripts/perceivers/singletask.py, I get the error

Traceback (most recent call last):
  File "private_test_scripts/perceivers/singletask.py", line 5, in <module>
    from private_test_scripts.perceivers.crossattnperceiver import MultiModalityPerceiver, InputModality
  File "/run/determined/workdir/irad_users/smithk/nlp/HighMMT/HighMMT/private_test_scripts/perceivers/crossattnperceiver.py", line 9, in <module>
    from perceiver_pytorch.caching import cache_by_name_fn
ModuleNotFoundError: No module named 'perceiver_pytorch.caching'

Could you push the perceiver_pytorch module?
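One way to narrow this kind of error down is to check whether the whole package is missing or only the submodule. The sketch below uses the standard library's importlib.util.find_spec; the helper name module_available is hypothetical, and the check says nothing about where a matching perceiver_pytorch.caching module should come from.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located by the import system."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is itself missing.
        return False


# Distinguish "package not installed" from "package installed, submodule absent".
for name in ("perceiver_pytorch", "perceiver_pytorch.caching"):
    print(name, module_available(name))
```

If the first line prints True and the second prints False, the installed package exists but does not ship the caching submodule.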

Problems while Running the code

As given in the documentation, I installed the dependencies and tried to run the code:

$ python private_test_scripts/perceivers/roboticstasks.py model.pt

After trying to run that, it gives me some sort of fannypack related error:

The full error is:

python private_test_scripts/perceivers/roboticstasks.py model.pt
Output will be model.pt
Traceback (most recent call last):
  File "/media/4TB_hardisk/sangam/HighMMT/private_test_scripts/perceivers/roboticstasks.py", line 31, in <module>
    trains3, valid3, test3 = PushTask.get_dataloader(
  File "/media/4TB_hardisk/sangam/HighMMT/datasets/gentle_push/data_loader.py", line 84, in get_dataloader
    train_trajectories = cls.get_train_trajectories(**dataset_args)
  File "/media/4TB_hardisk/sangam/HighMMT/datasets/gentle_push/data_loader.py", line 134, in get_train_trajectories
    return _load_trajectories("gentle_push_1000.hdf5", **dataset_args)
  File "/media/4TB_hardisk/sangam/HighMMT/datasets/gentle_push/data_loader.py", line 247, in _load_trajectories
    with fannypack.data.TrajectoriesFile(
  File "/home/sangam/anaconda3/envs/jtsaw/lib/python3.10/site-packages/fannypack/data/_trajectories_file.py", line 77, in __init__
    with self._h5py_file() as f:
  File "/home/sangam/anaconda3/envs/jtsaw/lib/python3.10/site-packages/fannypack/data/_trajectories_file.py", line 354, in _h5py_file
    return h5py.File(self._path, mode=mode, libver="latest")
  File "/home/sangam/anaconda3/envs/jtsaw/lib/python3.10/site-packages/h5py/_hl/files.py", line 562, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/home/sangam/anaconda3/envs/jtsaw/lib/python3.10/site-packages/h5py/_hl/files.py", line 235, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
OSError: Unable to synchronously open file (file signature not found)
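A "file signature not found" error from h5py typically indicates that the file does not begin with the HDF5 format signature, for example because a download was incomplete or saved something other than the dataset. The sketch below checks the first eight bytes of a file against that signature. The helper names are hypothetical, the location of gentle_push_1000.hdf5 on disk is not assumed here, and HDF5 files with a user block (signature at a later offset) are not handled.

```python
# The 8-byte signature that starts an HDF5 file without a user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(8) == HDF5_SIGNATURE


def peek(path, n=64):
    """Return the first `n` bytes, useful for spotting text saved in place of data."""
    with open(path, "rb") as f:
        return f.read(n)
```

If looks_like_hdf5 returns False for the dataset file, printing peek(path) may show what was saved instead, and re-downloading the dataset is a reasonable next step.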
