
det's Introduction

DeT and DOT

Code and datasets for

  1. "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021)
  2. "Depth-only Object Tracking" (BMVC2021)
@InProceedings{yan2021det,
    author    = {Yan, Song and Yang, Jinyu and Kapyla, Jani and Zheng, Feng and Leonardis, Ales and Kamarainen, Joni-Kristian},
    title     = {DepthTrack: Unveiling the Power of RGBD Tracking},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {10725-10733}
}

@InProceedings{yan2021dot,
  title       = {Depth-only Object Tracking},
  author      = {Yan, Song and Yang, Jinyu and Leonardis, Ales and Kamarainen, Joni-Kristian},
  booktitle   = {Proceedings of the British Machine Vision Conference (BMVC)},
  year        = {2021},
  organization= {British Machine Vision Association}
}

DepthTrack Test set (50 Sequences)

Download

DepthTrack Training set (152 Sequences)

Download (100 seqs), Download (52 seqs)

All videos are 640x360, except for 4 sequences in 640x320: painting_indoor_320, pine02_wild_320, toy07_indoor_320 (some gt missing), and hat02_indoor_320.

Monocular Depth Estimation

[2022.01] : The author found that DPT (Vision Transformers for Dense Prediction) works very well for depth estimation!

[2021.07] : DenseDepth and HighResDepth

Generated LaSOT Depth Images

We manually removed bad sequences, leaving 646 sequences in total, all generated with the DenseDepth method (some zip files may be broken and will be updated soon). The original DenseDepth outputs are in the range [0, 1.0]; we multiply them by 2^16. Please refer to LaSOT for the RGB images and ground truth.
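The scaling described above can be sketched as follows. This is a minimal illustration and not the authors' script: the helper name `depth_to_uint16` is hypothetical, and the clipping to 65535 is an assumption, since a value of exactly 1.0 multiplied by 2^16 would not fit in 16 bits.

```python
import numpy as np

def depth_to_uint16(depth):
    """Scale a depth map in [0, 1] to the 16-bit integer range (hypothetical helper)."""
    depth = np.asarray(depth, dtype=np.float64)
    scaled = depth * 2**16
    # Assumption: clip so that 1.0 maps to 65535 instead of overflowing uint16.
    return np.clip(scaled, 0, 65535).astype(np.uint16)

# Toy 1x3 depth map standing in for a DenseDepth output.
example = depth_to_uint16([[0.0, 0.5, 1.0]])
```

The resulting array could then be written as a 16-bit PNG with an image library of your choice.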

Download (part01), Download (part02), Download (part03), Download (part04), Download (part05),

Download (part06), Download (part07), Download (part08), Download (part09), Download (part10)

Download (lion, kangaroo): fixes the bad zip files

Download (pig, rabbit, robot, rubicCube): fixes the bad zip files

Download (lizard, microphone, monkey, motorcycle, person): fixes the bad zip files

Generated Got10K Depth Images

Download (0001 - 0700), Download (0701 - 1500), Download (1501 - 2100), Download (2101 - 2600),

Download (2601 - 3200), Download (3201 - 3700), Download (3701 - 4000), Download (4001 - 4300),

Download (4301 - 4500), Download (4501 - 4800), Download (4801 - 5200), Download (5201 - 5500),

Download (5501 - 5800), Download (5801 - 5990), Download (5991 - 6200), Download (6201 - 6400),

Download (6401 - 6700), Download (6701 - 7200), Download (7201 - 7600), Download (7601 - 8000),

Download (8001 - 8700), Download (8701 - 9000), Download (9001 - 9200), Download (9201 - 9335)

Generated COCO Depth Images

Download

How to generate the depth maps for RGB benchmarks

We highly recommend generating high-quality depth data from the existing RGB tracking benchmarks, such as LaSOT, Got10K, TrackingNet, and COCO.

Examples of the generated depth are shown here. The first row shows the results from HighResDepth for LaSOT RGB images; the second and third rows are from DenseDepth for Got10K and COCO RGB images; the fourth row shows failure cases in which the targets are too close to the background or floor. The last row is from DenseDepth for CDTB RGB images.

Examples of generated depth images

In our paper, we used the DenseDepth monocular depth estimation method. We calculated the Ordinal Error (ORD) of the generated depth on CDTB and our DepthTrack test set; the mean ORD is about 0.386, which we have found in our work to be sufficient for training D or RGBD trackers.

We also tried the recent HighResDepth method from CVPR2021, which also performs very well.

@article{alhashim2018high,
  title={High quality monocular depth estimation via transfer learning},
  author={Alhashim, Ibraheem and Wonka, Peter},
  journal={arXiv preprint arXiv:1812.11941},
  year={2018}
}

@inproceedings{miangoleh2021boosting,
  title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
  author={Miangoleh, S Mahdi H and Dille, Sebastian and Mai, Long and Paris, Sylvain and Aksoy, Yagiz},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={9685--9694},
  year={2021}
}

The depth maps generated with HighResDepth will be uploaded soon.

If you find other excellent methods for generating high-quality depth images, please share them.

Architecture

The settings are the same as those of Pytracking; please read the Pytracking documentation for details.

The network architecture is very simple: one ResNet50 feature extractor is added for the Depth input, and the RGB and Depth feature maps are then merged. The figures below show

  1. the feature maps for RGB, D inputs and the merged RGBD ones,
  2. the network for RGBD DiMP50, and
  3. RGBD ATOM.

Figure: The feature maps for RGB, D and the merged RGBD

Figure: The network for RGB+D DiMP50

Figure: The network for RGB+D ATOM
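The merging step can be illustrated with a small sketch. It is not taken from the repository: the function `merge_features` is hypothetical, and the element-wise max and mean operations are an inference from the checkpoint names (`Max`, `Mean`). The `MC` variant is not covered here, and NumPy arrays stand in for the ResNet50 feature maps.

```python
import numpy as np

def merge_features(rgb_feat, depth_feat, mode="max"):
    """Merge two feature maps of identical shape element-wise (illustrative only)."""
    rgb_feat = np.asarray(rgb_feat, dtype=np.float32)
    depth_feat = np.asarray(depth_feat, dtype=np.float32)
    if rgb_feat.shape != depth_feat.shape:
        raise ValueError("RGB and depth feature maps must have the same shape")
    if mode == "max":
        # Keep the stronger response of the two modalities at each location.
        return np.maximum(rgb_feat, depth_feat)
    if mode == "mean":
        # Average the two modalities at each location.
        return (rgb_feat + depth_feat) / 2.0
    raise ValueError(f"unsupported merge mode: {mode}")

# Toy feature maps shaped (channels, height, width).
rgb = np.array([[[1.0, 4.0], [2.0, 0.0]]])
depth = np.array([[[3.0, 2.0], [2.0, 6.0]]])
merged_max = merge_features(rgb, depth, "max")
merged_mean = merge_features(rgb, depth, "mean")
```

Because both operations are element-wise, the merged map keeps the shape of a single-modality feature map, which is consistent with the description of the architecture as a small change to the RGB trackers.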

Download

  1. Download the training dataset and edit the path in local.py

  2. Download the checkpoints for DeT trackers (in install.sh)

The checkpoints (please don't edit them :)):

https://drive.google.com/drive/folders/1DHDVhGHYYhoI9mjmgVUoautQe11SIKHL?usp=sharing

These links do not work now!

gdown https://drive.google.com/uc\?id\=1djSx6YIRmuy3WFjt9k9ZfI8q343I7Y75 -O pytracking/networks/DeT_DiMP50_Max.pth
gdown https://drive.google.com/uc\?id\=1JW3NnmFhX3ZnEaS3naUA05UaxFz6DLFW -O pytracking/networks/DeT_DiMP50_Mean.pth
gdown https://drive.google.com/uc\?id\=1wcGJc1Xq_7d-y-1nWh6M7RaBC1AixRTu -O pytracking/networks/DeT_DiMP50_MC.pth
gdown https://drive.google.com/uc\?id\=17IIroLZ0M_ZVuxkGN6pVy4brTpicMrn8 -O pytracking/networks/DeT_DiMP50_DO.pth
gdown https://drive.google.com/uc\?id\=17aaOiQW-zRCCqPePLQ9u1s466qCtk7Lh -O pytracking/networks/DeT_ATOM_Max.pth
gdown https://drive.google.com/uc\?id\=15LqCjNelRx-pOXAwVd1xwiQsirmiSLmK -O pytracking/networks/DeT_ATOM_Mean.pth
gdown https://drive.google.com/uc\?id\=14wyUaG-pOUu4Y2MPzZZ6_vvtCuxjfYPg -O pytracking/networks/DeT_ATOM_MC.pth

Install

bash install.sh path-to-anaconda DeT

Train

Using the default DiMP50 or ATOM pretrained checkpoints can reduce the training time.

For example, move the default dimp50.pth into the checkpoints folder and rename it to DiMPNet_Det_EP0050.pth.tar

python run_training.py bbreg DeT_ATOM_Max
python run_training.py bbreg DeT_ATOM_Mean
python run_training.py bbreg DeT_ATOM_MC

python run_training.py dimp DeT_DiMP50_Max
python run_training.py dimp DeT_DiMP50_Mean
python run_training.py dimp DeT_DiMP50_MC

Test

python run_tracker.py atom DeT_ATOM_Max --dataset_name depthtrack --input_dtype rgbcolormap
python run_tracker.py atom DeT_ATOM_Mean --dataset_name depthtrack --input_dtype rgbcolormap
python run_tracker.py atom DeT_ATOM_MC --dataset_name depthtrack --input_dtype rgbcolormap

python run_tracker.py dimp DeT_DiMP50_Max --dataset_name depthtrack --input_dtype rgbcolormap
python run_tracker.py dimp DeT_DiMP50_Mean --dataset_name depthtrack --input_dtype rgbcolormap
python run_tracker.py dimp DeT_DiMP50_MC --dataset_name depthtrack --input_dtype rgbcolormap
python run_tracker.py dimp DeT_DiMP50_DO --dataset_name depthtrack --input_dtype colormap


python run_tracker.py dimp dimp50 --dataset_name depthtrack --input_dtype color
python run_tracker.py atom default --dataset_name depthtrack --input_dtype color
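The test commands above differ only in the tracker name, the parameter set, and the `--input_dtype` value. The following sketch collects that mapping in one place and rebuilds the command strings. The helper `build_test_commands` is hypothetical and not part of the repository; the values are copied from the commands listed above.

```python
# (tracker, parameter set, input dtype), copied from the commands above.
TEST_RUNS = [
    ("atom", "DeT_ATOM_Max", "rgbcolormap"),
    ("atom", "DeT_ATOM_Mean", "rgbcolormap"),
    ("atom", "DeT_ATOM_MC", "rgbcolormap"),
    ("dimp", "DeT_DiMP50_Max", "rgbcolormap"),
    ("dimp", "DeT_DiMP50_Mean", "rgbcolormap"),
    ("dimp", "DeT_DiMP50_MC", "rgbcolormap"),
    ("dimp", "DeT_DiMP50_DO", "colormap"),  # depth-only variant
    ("dimp", "dimp50", "color"),            # RGB baseline
    ("atom", "default", "color"),           # RGB baseline
]

def build_test_commands(dataset_name="depthtrack"):
    """Return the run_tracker.py command line for every run (hypothetical helper)."""
    return [
        f"python run_tracker.py {tracker} {params} "
        f"--dataset_name {dataset_name} --input_dtype {dtype}"
        for tracker, params, dtype in TEST_RUNS
    ]

commands = build_test_commands()
```

Printing `commands` line by line reproduces the list above, which can be convenient when looping over all variants in a shell script.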

det's People

Contributors

xiaozai

det's Issues

About Datasets Download

Hello, may I ask why an additional download link is provided for the first sequence file in the dataset download links? Does it point to the latest version of that sequence?

Evaluation code

Hi,
Is there evaluation code for this dataset? If not, how did you evaluate it?

NAN convert

depthtrack.py
gt = pandas.read_csv(bb_anno_file, delimiter=',', header=None, dtype=np.float32, na_filter=False, low_memory=False).values
Maybe na_filter should be True?
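A minimal sketch of what the suggestion would look like, under the assumption (not confirmed here) that some annotation files contain literal `nan` tokens for frames without ground truth. The helper `read_bb_anno` is hypothetical and only mirrors the call quoted above; whether this change resolves the behavior in `depthtrack.py` has not been verified.

```python
import io

import numpy as np
import pandas as pd

def read_bb_anno(file_or_buffer):
    """Read comma-separated boxes; missing values become NaN (hypothetical helper)."""
    # With na_filter=True, tokens such as "nan" or empty fields are parsed as NaN.
    frame = pd.read_csv(file_or_buffer, delimiter=",", header=None, na_filter=True)
    return frame.to_numpy(dtype=np.float32)

# In-memory stand-in for a groundtruth file with one missing annotation.
sample = io.StringIO("10,20,30,40\nnan,nan,nan,nan\n")
gt = read_bb_anno(sample)

# Rows without any NaN are usable boxes.
valid = ~np.isnan(gt).any(axis=1)
```

Converting to `float32` after parsing, rather than passing `dtype` to `read_csv`, keeps the parsing step tolerant of missing values.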

running on custom dataset

Hi! I have already trained a detector on my dataset. What I would like to do now is run one of your trackers on my dataset, using the detection results as input.
For example, I have successfully run tracking algorithms such as SORT, providing an input file in the standard MOT format.
Can you guide me on how to provide my dataset as input to your method?

Questions on DeT training datasets

Hi,

Thanks for the brilliant work on RGB-D tracking.

I am new to the tracking domain. I have several questions regarding the tracking datasets. Can you please help me to clarify several points?

Firstly, the paper mentions that DeT is first pretrained on Pseudo LaSOT and Pseudo COCO, and then finetuned on DepthTrack. I would like to know whether it makes any difference if we train directly on all three datasets.

Secondly, the GitHub page mentions that using the default DiMP50 or ATOM pretrained checkpoints can reduce the training time. It seems that DiMP and ATOM are pretrained on larger RGB datasets (TrackingNet, Got10K, etc.). I would like to know whether these pretrained weights were used to initialize the model weights for the paper results, or whether in the paper the network is trained only with Pseudo LaSOT, Pseudo COCO, and DepthTrack.

Finally, one question regarding Table 2 (Comparison of the original RGB trackers and their DeT variants): how are the RGB baselines trained? Only with RGB images from Pseudo LaSOT, Pseudo COCO, and DepthTrack? Are they initialized only with a pretrained encoder (ImageNet)?

Sorry to bother you with all these questions... Looking forward to hearing from you.
Thanks again

About depth image bit?

"DepthTrack: Unveiling the Power of RGBD Tracking" says:

RGB images were stored as 24-bit JPEG with low compression rate and the depth frames as 16-bit PNG.

I use the following code to read a depth PNG:

from PIL import Image
import numpy as np
a = np.array(Image.open('....../adapter02_indoor/depth/00000001.png'))

I find that its dtype is int32. How should I understand this?
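One possible explanation, offered as an assumption rather than a confirmed answer: some versions of Pillow load 16-bit grayscale PNGs in the 32-bit integer mode `I`, so the array dtype is int32 even though the stored values fit in 16 bits. The sketch below checks the value range and casts back; the helper `to_uint16` is hypothetical, and a small int32 array stands in for the loaded image.

```python
import numpy as np

def to_uint16(depth):
    """Cast an integer depth array to uint16 after checking its range (hypothetical helper)."""
    depth = np.asarray(depth)
    if depth.min() < 0 or depth.max() > 65535:
        raise ValueError("values do not fit in 16 bits")
    return depth.astype(np.uint16)

# Stand-in for np.array(Image.open(...)) when it returns int32.
loaded = np.array([[0, 1200], [65535, 40]], dtype=np.int32)
depth16 = to_uint16(loaded)
```

If the range check passes on the real depth frames, the int32 dtype is only a loading detail and no information is lost by the cast.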

Per-attribute F-scores

Hello,

I would like to know how to generate the per-attribute F-scores (as in Figure 7 of your paper) when analysing tracker results.
Was this figure generated with the VOT toolkit, and if so, how does it work?
At your convenience, would you please help me with this problem?

Thanks for your assistance.

DepthTrack Train Set

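Does each line of groundtruth.txt in the DepthTrack Training Set represent (xmin, ymin, w, h)?
Also, in some frames the last two of the four gt values are negative. What does that mean? Do zero values indicate occlusion?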

DepthTrack download

Hello, thank you for your excellent work!
When I want to download DepthTrack dataset, I find the website is not stable and the download speed is very slow (8kb/s).
So would you please release a baiduyun version of the dataset? Thank you so much.

Question about train on DepthTrack dataset

Hello, I tried to train DeT DiMP50 Mean on the DepthTrack dataset. I set the path and selected a few sequences to test the training. When I ran “python run_training.py dimp DeT_DiMP50_Mean”, an error was raised.

I traced the error to ltr/trainers/base_trainer.py:

Restarting training from last epoch ...
/home/cat/ljt/DeT/checkpoints/ltr/dimp/DeT_DiMP50_Mean/DiMPnet_DeT_ep*.pth.tar
Training crashed at epoch 51
Traceback for the error!
Traceback (most recent call last):
File "../ltr/trainers/base_trainer.py", line 70, in train
self.train_epoch()
File "../ltr/trainers/ltr_trainer.py", line 80, in train_epoch
self.cycle_dataset(loader)
File "../ltr/trainers/ltr_trainer.py", line 52, in cycle_dataset
for i, data in enumerate(loader, 1):
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "pandas/_libs/parsers.pyx", line 1095, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float32') according to the rule 'safe'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "../ltr/data/sampler.py", line 108, in getitem
seq_info_dict = dataset.get_sequence_info(seq_id)
File "../ltr/dataset/depthtrack.py", line 125, in get_sequence_info
bbox = self._read_bb_anno(depth_path)
File "../ltr/dataset/depthtrack.py", line 95, in _read_bb_anno
gt = pandas.read_csv(bb_anno_file, delimiter=',', header=None, dtype=np.float32, na_filter=False, low_memory=False).values
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 488, in _read
return parser.read(nrows)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 1047, in read
index, columns, col_dict = self._engine.read(nrows)
File "/home/cat/anaconda3/envs/DeT/lib/python3.7/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 229, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 783, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 880, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1026, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1103, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: cannot safely convert passed user dtype of float32 for object dtyped data in column 0

Broken lasot depth zip list

The following LaSOT depth zip files are broken:

'kangaroo-10',
 'kangaroo-11',
 'kangaroo-13',
 'kangaroo-17',
 'kangaroo-20',
 'kangaroo-3',
 'kangaroo-4',
 'kangaroo-6',
 'kangaroo-8',
 'kangaroo-9',
 'lion-10',
 'lion-11',
 'lion-12',
 'lion-15',
 'lion-18',
 'lion-2',
 'lion-20',
 'lion-4',
 'lion-7',
 'lion-8',
 'lizard-10',
 'lizard-11',
 'lizard-12',
 'lizard-13',
 'lizard-14',
 'lizard-15',
 'lizard-17',
 'lizard-19',
 'lizard-3',
 'lizard-5',
 'lizard-6',
 'lizard-8',
 'lizard-9',
 'microphone-16',
 'microphone-17',
 'microphone-18',
 'microphone-19',
 'microphone-3',
 'microphone-6',
 'monkey-1',
 'monkey-10',
 'monkey-11',
 'monkey-13',
 'monkey-14',
 'monkey-15',
 'monkey-16',
 'monkey-17',
 'monkey-2',
 'monkey-20',
 'monkey-3',
 'monkey-4',
 'monkey-5',
 'monkey-6',
 'monkey-8',
 'monkey-9',
 'motorcycle-12',
 'motorcycle-14',
 'motorcycle-17',
 'motorcycle-2',
 'motorcycle-20',
 'motorcycle-3',
 'motorcycle-6',
 'motorcycle-7',
 'motorcycle-9',
 'person-1',
 'person-11',
 'person-14',
 'person-15',
 'person-17',
 'person-18',
 'person-5',
 'person-6',
 'person-9',
 'pig-1',
 'pig-10',
 'pig-11',
 'pig-12',
 'pig-13',
 'pig-14',
 'pig-15',
 'pig-16',
 'pig-17',
 'pig-18',
 'pig-19',
 'pig-5',
 'pig-6',
 'pig-7',
 'rabbit-1',
 'rabbit-10',
 'rabbit-11',
 'rabbit-13',
 'rabbit-15',
 'rabbit-16',
 'rabbit-17',
 'rabbit-2',
 'rabbit-20',
 'rabbit-4',
 'rabbit-7',
 'rabbit-8',
 'rabbit-9',
 'robot-1',
 'robot-10',
 'robot-11',
 'robot-13',
 'robot-16',
 'robot-17',
 'robot-18',
 'robot-19',
 'robot-2',
 'robot-4',
 'robot-5',
 'robot-6',
 'robot-7',
 'robot-8',
 'rubicCube-10',
 'rubicCube-11',
 'rubicCube-12',
 'rubicCube-13',
 'rubicCube-14',
 'rubicCube-15',
 'rubicCube-16',
 'rubicCube-17',
 'rubicCube-18',
 'rubicCube-19',
 'rubicCube-20',
 'rubicCube-4',
 'rubicCube-5',
 'rubicCube-8',
 'rubicCube-9'

I appreciate your hard work on the fix.
BR.

Can we use DeT with detector?

Hi~ Thanks for the great work and code!

I ran the code with a Kinect and found that it tracks objects quite well with a user-defined bounding box. I'm wondering if I can use the DeT tracker with detectors such as YOLO or MaskRCNN. For example, the detector predicts instance segmentations with bboxes, and the bboxes are tracked by DeT. Finally, DeT would assign new IDs to these bboxes or associate them with previous IDs. However, in the current setting the bbox to track is defined by the user. How can the detections be filtered automatically?

Is there any script or a hint for doing this? Thank you very much!

A naming error

The sequence_list tells us there is a sequence called lock_wild in the DepthTrack test set, but it is actually in the training set link you provide, where it is called lock01_wild. At the same time, there is a sequence called lock02_indoor in the test set link you provide.

Please note this.

Folder structure

Hi @xiaozai

I really appreciate your work and would like to use it for my thesis.
But I do not know where I should put the checkpoints and the test dataset.
Can you tell me where they should go?

Thank you and BRs,
Quan Ha

Replicating DeT using a limited dataset

Due to my hard drive capacity limitations, I only have access to 1200 GOT-10k depth sequences, as well as the full COCO depth dataset and DepthTrack. I'm trying to replicate DeT_DiMP50_Mean, but I find that the total loss remains around 1 and decreases very little. The validation set loss decreases, and the training speed is very slow, taking nearly an hour for each epoch. When I trained using only the COCO training set, the results were also not good; the training loss decreased, but the validation set loss stayed around 1, and the testing performance kept getting worse. I'm not sure what the problem is. Can I train a good DeT tracker with the limited dataset I have?

Training datasets

Could you please provide your training datasets, i.e. the COCO_depth and Lasot_depth datasets? Thank you very much.

How to generate the F-score, Precision and Recall

Hello,

It's really great that you guys have disclosed such an excellent work.

I would like to know how to obtain the F-score, precision and recall values when I get the raw results with '.txt' suffix?

Thanks a lot!
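A sketch of one way to compute these values, assuming the long-term tracking protocol in which precision and recall are averaged overlaps and the F-score is their harmonic mean, maximised over confidence thresholds. Whether this matches the exact evaluation used for the paper is an assumption. The function names and the input layout (per-frame overlap, per-frame confidence, per-frame visibility) are hypothetical, and the overlaps are assumed to have been computed beforehand, with 0 for frames where the target is absent.

```python
def precision_recall_f(overlaps, confidences, target_visible, threshold):
    """Tracking precision, recall and F-score at one confidence threshold."""
    # A frame counts as "predicted" when the tracker confidence reaches the threshold.
    predicted = [c >= threshold for c in confidences]
    n_pred = sum(predicted)
    n_vis = sum(target_visible)
    # Precision: mean overlap over frames where the tracker reports the target.
    pr = sum(o for o, p in zip(overlaps, predicted) if p) / n_pred if n_pred else 0.0
    # Recall: mean overlap over frames where the target is visible;
    # visible frames without a prediction contribute zero.
    re = (
        sum(o for o, p, v in zip(overlaps, predicted, target_visible) if p and v) / n_vis
        if n_vis
        else 0.0
    )
    f = 2 * pr * re / (pr + re) if pr + re > 0 else 0.0
    return pr, re, f

def best_f_score(overlaps, confidences, target_visible):
    """Maximum F-score over all thresholds given by the observed confidences."""
    return max(
        precision_recall_f(overlaps, confidences, target_visible, t)[2]
        for t in sorted(set(confidences))
    )

# Toy sequence of four frames; the target is absent in the third frame.
overlaps = [0.8, 0.6, 0.0, 0.4]
confidences = [0.9, 0.7, 0.8, 0.2]
visible = [True, True, False, True]
pr, re, f = precision_recall_f(overlaps, confidences, visible, threshold=0.5)
best = best_f_score(overlaps, confidences, visible)
```

For a whole dataset, the per-frame values of all sequences would be pooled or averaged before taking the maximum; which of the two applies depends on the evaluation protocol being followed.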
