
TransT - Transformer Tracking [CVPR2021]

Official implementation of TransT (CVPR 2021), including training code and trained models.

News

  • 🏆 TransT-M wins the VOT2021 Real-Time Challenge with a multi-start EAO of 0.550! The code will be released soon.

Tracker

TransT

[Paper] [Models(google)] [Models(baidu:iiau)] [Raw Results]

This work presents an attention-based feature fusion network that effectively combines the template and search-region features using attention. Specifically, the proposed method includes an ego-context augment (ECA) module based on self-attention and a cross-feature augment (CFA) module based on cross-attention. Building on a Siamese-like feature-extraction backbone, this attention-based fusion mechanism, and a classification and regression head, we present the Transformer Tracking method, named TransT.

TransT is a very simple and efficient tracker: it has no online update module and uses the same model and hyperparameters for all test sets.

[Figure: TransT overview, and the ECA and CFA modules]
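To make the fusion mechanism concrete, here is a minimal PyTorch sketch of the ECA/CFA idea. It is an illustration only, not the repository's featurefusion_network.py: the module names are ours, one ECA is shared across branches for brevity, and the feed-forward sublayers are omitted.

    # Minimal sketch of the ECA/CFA fusion idea (illustrative, simplified).
    import torch
    import torch.nn as nn

    class EgoContextAugment(nn.Module):
        """ECA: self-attention within one branch's feature map."""
        def __init__(self, d_model=256, nhead=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, nhead)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, x, pos):
            q = k = x + pos                    # positional encoding on query/key
            out, _ = self.attn(q, k, value=x)
            return self.norm(x + out)          # residual connection + layer norm

    class CrossFeatureAugment(nn.Module):
        """CFA: cross-attention from one branch into the other."""
        def __init__(self, d_model=256, nhead=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, nhead)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, x, y, pos_x, pos_y):
            out, _ = self.attn(query=x + pos_x, key=y + pos_y, value=y)
            return self.norm(x + out)

    # Flattened template/search features: (tokens, batch, channels)
    template = torch.randn(16 * 16, 1, 256)
    search = torch.randn(32 * 32, 1, 256)
    pos_t, pos_s = torch.randn_like(template), torch.randn_like(search)

    eca, cfa = EgoContextAugment(), CrossFeatureAugment()
    template = eca(template, pos_t)              # ECA on the template branch
    search = eca(search, pos_s)                  # ECA on the search branch
    fused = cfa(search, template, pos_s, pos_t)  # search queries attend to template
    print(fused.shape)                           # torch.Size([1024, 1, 256])

In the actual tracker this fusion layer is stacked N times (N = 4 for TransT-N4) before the classification and regression head.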

Results

For VOT2020, we add a mask branch to generate segmentation masks, without any hyperparameter tuning. The code of the mask branch will be released soon.

| Model | LaSOT AUC (%) | TrackingNet AUC (%) | GOT-10k AO (%) | VOT2020 EAO (%) | TNL2K AUC (%) | OTB100 AUC (%) | NFS AUC (%) | UAV123 AUC (%) | Speed | Params |
|-----------|------|------|------|------|------|------|------|------|--------|-------|
| TransT-N2 | 64.2 | 80.9 | 69.9 | -    | -    | 68.1 | 65.7 | 67.0 | 70 fps | 16.7M |
| TransT-N4 | 64.9 | 81.4 | 72.3 | 49.5 | 51.0 | 69.4 | 65.7 | 69.1 | 50 fps | 23.0M |

Installation

This document contains detailed instructions for installing the necessary dependencies for TransT. The instructions have been tested on an Ubuntu 18.04 system.

Install dependencies

  • Create and activate a conda environment

    conda create -n transt python=3.7
    conda activate transt
  • Install PyTorch

    conda install -c pytorch pytorch=1.5 torchvision=0.6.1 cudatoolkit=10.2
  • Install other packages

    conda install matplotlib pandas tqdm
    pip install opencv-python tb-nightly visdom scikit-image tikzplotlib gdown
    conda install cython scipy
    sudo apt-get install libturbojpeg
    pip install pycocotools jpeg4py
    pip install wget yacs
    pip install shapely==1.6.4.post2
  • Setup the environment
    Create the default environment setting files.

    # Change directory to <PATH_of_TransT>
    cd TransT
    
    # Environment settings for pytracking. Saved at pytracking/evaluation/local.py
    python -c "from pytracking.evaluation.environment import create_default_local_file; create_default_local_file()"
    
    # Environment settings for ltr. Saved at ltr/admin/local.py
    python -c "from ltr.admin.environment import create_default_local_file; create_default_local_file()"

You can modify these files to set the paths to datasets, results paths etc.

  • Add the project path to environment variables
    Open ~/.bashrc and add the following line at the end. Remember to change <path_of_TransT> to your real path.
    export PYTHONPATH=<path_of_TransT>:$PYTHONPATH
    
  • Download the pre-trained networks
    Download the network for TransT and put it in the directory set by "network_path" in "pytracking/evaluation/local.py". By default, it is set to pytracking/networks.
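For reference, an edited pytracking/evaluation/local.py looks roughly like the sketch below. This is illustrative only: the exact attribute names are defined by the file that create_default_local_file() generates, and all paths are placeholders to adjust for your machine.

    # Illustrative pytracking/evaluation/local.py (attribute names may differ
    # slightly from your generated file; the paths are placeholders).
    from pytracking.evaluation.environment import EnvSettings

    def local_env_settings():
        settings = EnvSettings()
        settings.network_path = '/path/to/TransT/pytracking/networks/'  # trained models
        settings.results_path = '/path/to/TransT/pytracking/tracking_results/'
        settings.lasot_path = '/path/to/datasets/LaSOT/'
        settings.got10k_path = '/path/to/datasets/GOT-10k/'
        return settings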

Quick Start

Training

  • Modify local.py to set the paths to datasets, results, etc.
  • Run the following commands to train TransT. You can customize some parameters by modifying transt.py (see the sketch after the commands).
    conda activate transt
    cd TransT/ltr
    python run_training.py transt transt
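As a rough guide, the train-settings file exposes knobs like the ones below. This is a hypothetical excerpt in the style of ltr/train_settings/transt/transt.py; batch_size = 38 and samples_per_epoch = 1000 * batch_size match the defaults discussed in the issues further down, while the remaining names are illustrative and should be checked against the real file.

    # Hypothetical excerpt in the style of ltr/train_settings/transt/transt.py.
    # Only batch_size and samples_per_epoch are confirmed defaults; treat the
    # other attribute names as illustrative.
    def run(settings):
        settings.batch_size = 38                                  # images per iteration
        settings.num_workers = 8                                  # data-loading workers
        settings.print_interval = 1                               # logging frequency
        settings.samples_per_epoch = 1000 * settings.batch_size   # 1000 iterations/epoch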

Evaluation

  • We integrated PySOT for evaluation. You can download the json files from PySOT or here.

    You need to specify the paths of the model and the datasets in test.py.

    net_path = '/path_to_model'  # Absolute path of the model
    dataset_root = '/path_to_datasets'  # Absolute path of the datasets

    Then run the following commands.

    conda activate transt
    cd TransT
    python -u pysot_toolkit/test.py --dataset <name of dataset> --name 'transt' #test tracker
    python pysot_toolkit/eval.py --tracker_path results/ --dataset <name of dataset> --num 1 --tracker_prefix 'transt' #eval tracker

    The test results will be saved in the current directory (results/dataset/transt/).

  • You can also use pytracking to test and evaluate the tracker. The results may differ slightly from those of the PySOT toolkit due to a small implementation difference: pytracking saves results as integers, while the pysot toolkit saves them as decimals. The sketch below illustrates the difference.
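A small sketch of why the numbers can drift (illustrative only; boxes are in x, y, w, h format):

    # pytracking rounds each box coordinate to an integer when saving results,
    # while the pysot toolkit keeps the decimals, so overlap scores and the
    # resulting AUC/precision numbers can differ slightly.
    box = [123.6, 45.2, 87.9, 60.4]              # predicted box (x, y, w, h)
    pytracking_style = [round(v) for v in box]   # -> [124, 45, 88, 60]
    pysot_style = box                            # decimals kept as-is
    print(pytracking_style, pysot_style)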

Getting Help

If you run into a problem, please try searching our GitHub issues first; if you can't find a solution, feel free to open a new issue.

  • ImportError: cannot import name region

Solution: If you don't test on VOT2019/18/16, you can simply delete the line "from pysot_toolkit.toolkit.utils.region import vot_overlap, vot_float2str" in test.py. Alternatively, build the region module by running "python setup.py build_ext --inplace" in pysot_toolkit.

Citation

@inproceedings{TransT,
  title={Transformer Tracking},
  author={Chen, Xin and Yan, Bin and Zhu, Jiawen and Wang, Dong and Yang, Xiaoyun and Lu, Huchuan},
  booktitle={CVPR},
  year={2021}
}

Acknowledgement

This is a modified version of the Python framework PyTracking, based on PyTorch, and it also borrows from PySOT and DETR. We would like to thank their authors for providing great frameworks and toolkits.

Contact

  • Xin Chen (email:[email protected])

    Feel free to contact me if you have additional questions.

transt's Issues

About Training Dataset

Thank you for your wonderful work. I have a question about the training datasets: I noticed that the TrackingNet dataset has 12 subsets [0-11] in total, so why are only 4 subsets [0-3] used for training in the paper? Thanks :)

visdom and CUDA

Question 1: When running run_tracker.py, why does visdom show only three figures, two of them blank, no matter what value the debug parameter is set to?
Question 2: There is an AttributeError about CUDA in basetracker.py: the last lines are "if cfg.CUDA: im_patch = im_patch.cuda()", but cfg has no CUDA attribute. Is it OK to simply change this to "im_patch = im_patch.cuda()"?

A circular import error occurs when running transt.py. How can I solve it?

    Traceback (most recent call last):
      File "/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/transt/transt.py", line 1, in <module>
        from pytracking.tracker.base import BaseTracker, SiameseTracker
      File "/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/base/__init__.py", line 1, in <module>
        from .basetracker import BaseTracker
      File "/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/base/basetracker.py", line 2, in <module>
        from pytracking.tracker.transt.config import cfg
      File "/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/transt/__init__.py", line 1, in <module>
        from .transt import TransT
      File "/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/transt/transt.py", line 1, in <module>
        from pytracking.tracker.base import BaseTracker, SiameseTracker
    ImportError: cannot import name 'BaseTracker' from partially initialized module 'pytracking.tracker.base' (most likely due to a circular import) (/home/UserDirectory/hongshengz/TransT-main/pytracking/tracker/base/__init__.py)

[Errno 13] Permission denied: '/tensorboard'

Hello, when I run python run_training.py transt transt, I get the error shown in the title. How can I solve it?
    Traceback (most recent call last):
      File "run_training.py", line 55, in <module>
        main()
      File "run_training.py", line 50, in main
        run_training(args.train_module, args.train_name, args.cudnn_benchmark)
      File "run_training.py", line 39, in run_training
        expr_func(settings)
      File "../ltr/train_settings/transt/transt.py", line 98, in run
        trainer = LTRTrainer(actor, [loader_train], optimizer, settings, lr_scheduler)
      File "../ltr/trainers/ltr_trainer.py", line 30, in __init__
        self.tensorboard_writer = TensorboardWriter(tensorboard_writer_dir, [l.name for l in loaders])
      File "../ltr/admin/tensorboard.py", line 13, in __init__
        self.writer = OrderedDict({name: SummaryWriter(os.path.join(self.directory, name)) for name in loader_names})
      File "../ltr/admin/tensorboard.py", line 13, in <dictcomp>
        self.writer = OrderedDict({name: SummaryWriter(os.path.join(self.directory, name)) for name in loader_names})
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 225, in __init__
        self._get_file_writer()
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 256, in _get_file_writer
        self.flush_secs, self.filename_suffix)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 66, in __init__
        log_dir, max_queue, flush_secs, filename_suffix)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/tensorboard/summary/writer/event_file_writer.py", line 72, in __init__
        tf.io.gfile.makedirs(logdir)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/io/gfile.py", line 653, in makedirs
        return get_filesystem(path).makedirs(path)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/io/gfile.py", line 191, in makedirs
        os.makedirs(path, exist_ok=True)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/os.py", line 213, in makedirs
        makedirs(head, exist_ok=exist_ok)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/os.py", line 213, in makedirs
        makedirs(head, exist_ok=exist_ok)
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/os.py", line 213, in makedirs
        makedirs(head, exist_ok=exist_ok)
      [Previous line repeated 1 more time]
      File "/home/lab318/anaconda3/envs/transt/lib/python3.7/os.py", line 223, in makedirs
        mkdir(name, mode)
    PermissionError: [Errno 13] Permission denied: '/tensorboard'

Weight file

Could you provide the TransT weight file on Baidu Netdisk?

Architecture of the N× Feature Fusion Layer

Hello!
In your paper, the N× Feature Fusion Layer of the Feature Fusion Network is described in a unified form. What would the architecture diagram of, for example, a 4× Feature Fusion Layer look like?

Download of the pre-trained model

Hello, your work is very novel and I am very interested in it. Could you upload the pre-trained TransT model to Baidu Cloud? Thank you.

layer_norm after the last CFA layer

In featurefusion_network.py, the decoderCFA_layer already applies layer_norm at the end. Why is a separate decoderCFA_norm added to perform layer_norm once more?

Evaluation toolkit

Which evaluation toolkit was used in the paper? For example, the paper reports 69.1 on UAV123, but I get 67.9 when testing with pysot.

Results on TNL2K dataset

Hi, thanks for sharing your code. We have released a new dataset termed TNL2K and also tested your tracker on it. The overall performance looks good. Would you please add the results to this GitHub page? Thanks.


pytracking testing

Hello, will you fix the problem where, during pytracking testing, visdom shows only one tracking image no matter what the debug parameter is set to, and upload the fix to GitHub? I wanted to solve it myself, but my ability is limited and I couldn't manage it after several days of trying.

About training with 8 GPUs

Hi,

I found that the training time for the 500 epochs does not change much whether I use 8 GPUs or 2 GPUs. In theory, shouldn't it be about 4× faster with 8 GPUs?

Thanks!

Influence of fusion layers

Nice work! Very concise framework. I like it!

  1. How does the number of fusion layers influence the performance of TransT? Are there any experiments with N = 1, 2, 3?
  2. Does TransT really need no online update? Would an online update boost performance?
  3. How do you get the H×W prediction from the fusion vector? By directly reshaping the output of the MLP?

I want to turn TransT into a lightweight version for mobile applications; do you have any suggestions? Thanks a lot.

A question about attention maps

Hello, I saw your earlier reply to someone else about how to draw attention maps, and I tried adding the following code:

    # Self-attention (ECA) weights of the two branches:
    src12, attn_map1 = self.self_attn1(q1, k1, value=src1, attn_mask=src1_mask,
                                       key_padding_mask=src1_key_padding_mask)
    src22, attn_map2 = self.self_attn2(q2, k2, value=src2, attn_mask=src2_mask,
                                       key_padding_mask=src2_key_padding_mask)
    attn_map1 = attn_map1.cpu().data.numpy()[0].reshape(256, 16, 16)[0]
    attn_map2 = attn_map2.cpu().data.numpy()[0].reshape(1024, 32, 32)[0]

    def pltshow(pred_map, name):
        import matplotlib.pyplot as plt
        plt.figure(2)
        pred_frame = plt.gca()
        plt.imshow(pred_map, 'jet')
        pred_frame.axes.get_yaxis().set_visible(False)
        pred_frame.axes.get_xaxis().set_visible(False)
        for side in ('top', 'bottom', 'left', 'right'):
            pred_frame.spines[side].set_visible(False)
        pred_name = '/home/sun/桌面/TransT/heatmap/' + name + '.png'
        plt.savefig(pred_name, bbox_inches='tight', pad_inches=0, dpi=150)
        plt.close(2)

    pltshow(attn_map1, 'aaa')
    pltshow(attn_map2, 'bbb')

    # Cross-attention (CFA) weights between the two branches:
    src12, attn_map11 = self.multihead_attn1(query=self.with_pos_embed(src1, pos_src1),
                                             key=self.with_pos_embed(src2, pos_src2),
                                             value=src2, attn_mask=src2_mask,
                                             key_padding_mask=src2_key_padding_mask)
    src22, attn_map12 = self.multihead_attn2(query=self.with_pos_embed(src2, pos_src2),
                                             key=self.with_pos_embed(src1, pos_src1),
                                             value=src1, attn_mask=src1_mask,
                                             key_padding_mask=src1_key_padding_mask)
    attn_map11 = attn_map11.cpu().data.numpy()[0].reshape(256, 32, 32)[0]
    attn_map12 = attn_map12.cpu().data.numpy()[0].reshape(1024, 16, 16)[0]
    pltshow(attn_map11, 'ccc')
    pltshow(attn_map12, 'ddd')

After adding this, I found that the second image (bbb) always has a very high response in the top-left corner, whatever the sequence, and the fourth image (ddd) has its high responses in the four corners, which differs from the paper. Could you tell me why this happens? Is there a problem with my code? Looking forward to your reply!

Also about the inference speed

Hello, I have tested the model (trans.pth) on the GOT-10k test set, but I got a lower inference speed (around 30 fps) than the one mentioned in the paper (47 fps). Note that I used the same GPU as you (an RTX TITAN), and I am sure no other programs were occupying the computing resources.
Could the inference speed be affected by other factors?
PS: my CPU is an Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz.

Visualization for N = 1-4 in the paper

Hello, how was the visualization in the paper implemented? In featurefusion_network.py I can find the src1 and src2 produced by each layer, but I did not find any related visualization code.

About the inference speed.

Hello, I have tested the model (trans.pth) on GOT-10k and LaSOTBenchmark, but I got a lower inference speed (13 FPS) than the one mentioned in the paper (50 FPS). Moreover, the speed is unstable across different test instances, and even across different runs of the same instance. PS: CPU: Intel(R) Xeon(R) CPU E5-2640; GPU: 1080 Ti.

Decoder structure

Hello, I'd like to ask: why is an extra norm layer added at the end of the decoder?

A point of confusion about the paper

In the first part of the related work, "Visual object tracking", you mention that correlation has two problems. The first, that it cannot make full use of global information, I can understand. As for the second, that semantic information is lost after correlation, how should that be understood? Could you explain?

Testing on the LaSOT dataset

Hello!
When running test.py on the LaSOT dataset, where should the dataset_root path point? I don't see a separate test split directory. Thanks!

Training on GOT-10k or LaSOT alone

The current version trains on COCO, TrackingNet, GOT-10k and LaSOT together, for 1000 epochs with 1000 iterations per epoch and a batch size of 38. If I train on GOT-10k or LaSOT alone, using only the corresponding training set, the total number of training samples is smaller. Does the training configuration need to be modified? Would the original configuration overfit?

TrackingNet evaluation

Hello!
I submitted my test results to the TrackingNet website for evaluation, but the submitted zip file was reported as not meeting the requirements. How did you structure your submission file? Thanks!

Evaluation on GOT10k dataset

I tried to evaluate the model on the GOT-10k dataset, but pysot_toolkit/eval.py has no evaluation code for this dataset.
Could you advise how to evaluate the model on GOT-10k with the existing code?

Many thanks!

Question in featurefusion_network.py

Dear Chen:

"hs = self.decoder(memory_search, memory_temp, tgt_key_padding_mask=mask_search, memory_key_padding_mask=mask_temp, pos_enc=pos_temp, pos_dec=pos_search)"

I think the order of the inputs to the decoder should be memory_temp first (which I believe is the template branch) and memory_search second (the search-image branch), because when reading your paper I found that the k and v of the decoder's CFA come from the upper branch and q comes from the lower branch.
Thank you, and I look forward to your reply.

About the 1000 iterations per epoch

Hello! During training I found that each epoch runs 1000 iterations no matter what the batch size is set to. In theory, shouldn't the number of iterations per epoch be the total number of training images divided by the batch size? Thanks!

About training time

I used the default parameters: batch_size = 38, samples_per_epoch = 1000*batch_size, epoch = 1000. According to your answer, training on two Nvidia Titan RTX GPUs takes 10 days, i.e., an average of 100 epochs a day. I used two RTX 3090 GPUs and trained only 45 epochs in a day and a half. The only change I made was adding os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" and os.environ["CUDA_VISIBLE_DEVICES"] = '2,3' to the code. What may be the reason for the slow training?
Thanks~

Parameter settings for shortening the training schedule

Could you provide the training settings in TransT/ltr/train_settings/transt/transt.py (learning rate, weight_decay, and so on) for training 500 epochs? I modified the settings myself to use 500 epochs with twice the original learning rate, decaying it at epoch 250, but performance dropped considerably, and I am not sure why. Thanks.

TransT checkpoint file

Hi,

I want to work on your tracker and train it on my own dataset; unfortunately, there is no checkpoint file available in your GitHub.
Can you upload the checkpoint file?

Obviously, I will cite your paper after getting appropriate results.

How long does it take to train on two Nvidia Titan RTX GPUs?

Hi, it is a really great job! I have a question: in your paper, you train on two Nvidia Titan RTX GPUs with a batch size of 38, for a total of 1000 epochs with 1000 iterations per epoch. How long does it take to train in this setting? Thanks.

JSON files

Hello, could you share the JSON files needed for each dataset when evaluating with pysot?

Hanning window hyperparameters

Hello, how were the Hanning window hyperparameters set? Were they found by searching over an interval, as in pysot? And is the same hyperparameter used for all datasets?

eval.py

Hello, when evaluating on OTB2015 it reports that the OTB2015.json file is missing. Do I need to put the JSON file in the results folder? Thanks!
