ETRIS

This is an official PyTorch implementation of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation.

Overall Architecture

Preparation

  1. Environment

    • PyTorch (e.g. 1.8.1+cu111)
    • Other dependencies in requirements.txt
      pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
      pip install -r requirements.txt
  2. Datasets

  3. Pretrained weights

    • Download the pretrained CLIP weights for ResNet-50, ResNet-101, and ViT-B into the pretrain directory:
      mkdir pretrain && cd pretrain
      # ResNet-50
      wget https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt
      # ResNet-101
      wget https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt
      # ViT-B
      wget https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt
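
Large downloads can be truncated silently, so it may help to check the files before training. The sketch below is not part of the ETRIS code. It assumes that the long hex path segment in each URL above is the SHA-256 digest of the corresponding file, and the helper names (expected_digest, sha256_of, check_weights) are hypothetical. Run it from the repository root after the downloads finish.

```python
import hashlib
from pathlib import Path

# URLs copied from the download commands above.
CLIP_URLS = [
    "https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt",
    "https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt",
    "https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt",
]


def expected_digest(url: str) -> str:
    """Return the path segment just before the file name.

    Assumption: this segment is the SHA-256 hex digest of the file.
    """
    return url.rstrip("/").split("/")[-2]


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_weights(directory="pretrain", urls=CLIP_URLS) -> dict:
    """Map each expected file name to 'ok', 'missing', or 'mismatch'."""
    results = {}
    for url in urls:
        name = url.split("/")[-1]
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == expected_digest(url):
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in check_weights().items():
        print(f"{name}: {status}")
```

A file reported as "mismatch" is most likely an incomplete download and is worth fetching again; "missing" means the file is not in the pretrain directory.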

Quick Start

To train ETRIS, edit the script to match your setup and run:

bash run_scripts/train.sh

To evaluate ETRIS, edit the script to match your setup and run:

bash run_scripts/test.sh

Acknowledgements

The code is based on CRIS. We thank the authors for open-sourcing their code and encourage users to cite their work when applicable.

Citation

If ETRIS is useful for your research, please consider citing:

@article{xu2023bridging,
  title={Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation},
  author={Xu, Zunnan and Chen, Zhihong and Zhang, Yong and Song, Yibing and Wan, Xiang and Li, Guanbin},
  journal={arXiv preprint arXiv:2307.11545},
  year={2023}
}
