MFGNet_RGBT_Tracking_PyTorch

Official implementation of "MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking" by Xiao Wang, Xiujun Shu, Shiliang Zhang, Bo Jiang, Yaowei Wang, Yonghong Tian, and Feng Wu, accepted by IEEE Transactions on Multimedia (TMM), 2022. [Paper]

Abstract:

Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive weighting scheme (or attention mechanism). Different from these works, we propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data by adaptively adjusting the convolutional kernels for various input images in practical tracking. Given the image pairs as input, we first encode their features with the backbone network. Then, we concatenate these feature maps and generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively. Inspired by residual connection, both the generated visible and thermal feature maps will be summarized with input feature maps. The augmented feature maps will be fed into the RoI align module to generate instance-level features for subsequent classification. To address issues caused by heavy occlusion, fast motion and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target driven attention mechanism. The spatial and temporal recurrent neural network is used to capture the direction-aware context for accurate global attention prediction. Extensive experiments on three large-scale RGB-T tracking benchmark datasets validated the effectiveness of our proposed algorithm.
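The filter-generation step described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the code from this repository: the class names `DynamicFilterBranch` and `ModalityAwareFilterSketch`, the 3x3 depthwise kernels, and the global-average-pooled generator are all assumptions chosen to keep the example short. It only shows the general idea of concatenating both modalities, predicting per-sample kernels with two independent branches, applying them to the matching feature map, and adding the result back through a residual connection.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicFilterBranch(nn.Module):
    """Hypothetical generator: predicts one k x k depthwise kernel per channel."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.kernel_size = kernel_size
        # Input is the concatenated visible + thermal features (2 * channels).
        # Pooling to 1x1 before the 1x1 conv is an assumption of this sketch.
        self.generator = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels * kernel_size * kernel_size, kernel_size=1),
        )

    def forward(self, fused: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        k = self.kernel_size
        # One set of depthwise kernels per sample: shape (B*C, 1, k, k).
        kernels = self.generator(fused).reshape(b * c, 1, k, k)
        # Fold the batch into the channel axis so each sample is filtered
        # with the kernels generated from its own input pair.
        out = F.conv2d(feat.reshape(1, b * c, h, w), kernels, padding=k // 2, groups=b * c)
        # Residual connection: add the dynamically filtered map to the input.
        return feat + out.reshape(b, c, h, w)


class ModalityAwareFilterSketch(nn.Module):
    """Two independent branches, one per modality, sharing the fused input."""

    def __init__(self, channels: int):
        super().__init__()
        self.visible_branch = DynamicFilterBranch(channels)
        self.thermal_branch = DynamicFilterBranch(channels)

    def forward(self, visible_feat: torch.Tensor, thermal_feat: torch.Tensor):
        fused = torch.cat([visible_feat, thermal_feat], dim=1)
        return (
            self.visible_branch(fused, visible_feat),
            self.thermal_branch(fused, thermal_feat),
        )


if __name__ == "__main__":
    module = ModalityAwareFilterSketch(channels=8)
    visible = torch.randn(2, 8, 16, 16)
    thermal = torch.randn(2, 8, 16, 16)
    visible_out, thermal_out = module(visible, thermal)
    print(visible_out.shape, thermal_out.shape)
```

The grouped convolution is used here only because it lets every sample in a batch use its own kernels in a single call; the augmented maps keep the input shape, so they could be passed on to an RoI align step unchanged.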

Demo:

(Red: Ours, Blue: Ground Truth, Green: RT-MDNet)

Sequences: rgbt_car10, rgbt_balancebike, rgbt_flower1, rgbt_kite4

Install:

This code was developed with Python 3.7, PyTorch 1.0, and CUDA 10.1 on Ubuntu 16.04 with 4 Tesla P100 GPUs. Install any missing packages that are reported when you run it.

The RoI align module must be compiled first:

CUDA_HOME=/usr/local/cuda-10.1 python setup.py build_ext --inplace 
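
Before running the build, it can help to confirm that `CUDA_HOME` points at a toolkit that actually contains `nvcc`. The helper below is a small sketch and not part of this repository; the function name `find_nvcc` and the `bin/nvcc` layout (the usual location on Linux installs) are assumptions.

```python
import os
from typing import Optional


def find_nvcc(cuda_home: str) -> Optional[str]:
    """Return the path to nvcc under cuda_home, or None if it is missing."""
    # Assumes the standard Linux toolkit layout: <cuda_home>/bin/nvcc.
    candidate = os.path.join(cuda_home, "bin", "nvcc")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # Falls back to the path used in the build command above.
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda-10.1")
    nvcc = find_nvcc(cuda_home)
    if nvcc:
        print(f"nvcc found at {nvcc}")
    else:
        print(f"nvcc not found under {cuda_home}; check CUDA_HOME before building")
```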

Training and Testing:

  1. Generate "50.pkl" with prepro_rgbt.py as the training data.

  2. Train the tracker with train.py.

  3. Train the rgbt_TANet with train_rgbtTANet.py.

  4. Obtain the attention maps and run test.py for RGB-T tracking.
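
The steps above can be chained with a small driver script. This is a sketch, not a file shipped with the repository: the script names are taken from the list, but running them without arguments is an assumption, and the attention-map step is left out because no command is given for it here. By default the driver only prints the commands; pass `--run` to execute them.

```python
import subprocess
import sys
from typing import List

# Script names come from the steps above; calling them with no
# arguments is an assumption of this sketch.
PIPELINE = ["prepro_rgbt.py", "train.py", "train_rgbtTANet.py", "test.py"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Return one command per pipeline step, in execution order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run="--run" not in sys.argv)
```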

Results:

Comparison

Raw results for benchmark comparison: [Raw Results]

Acknowledgement:

Citation:

If you use this code for your research, please cite the following paper:

@article{wang2021mfgnet,
  title={MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking},
  author={Wang, Xiao and Shu, Xiujun and Zhang, Shiliang and Jiang, Bo and Wang, Yaowei and Tian, Yonghong and Wu, Feng},
  journal={IEEE Transactions on Multimedia},
  year={2022}
}

If you have any questions, feel free to contact me via email: [email protected].
