This repository provides the code and scripts for preparing our Dance ReID dataset. The videos are collected from YouTube under the Creative Commons license. We then perform human pose tracking on each video to extract human bounding boxes and skeleton landmarks. Finally, we manually filter out misaligned bounding-box labels and build a large-scale dataset for multi-person tracking and re-ID.
Our dataset includes:
- Cropped bounding boxes with human ID labels
- Human pose landmarks for every bounding box (in pixel coordinates)
- Diverse human poses for each person (along with their skeleton rendering maps)
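Since the landmarks above are given in pixel coordinates, a common first step is shifting them into each cropped box's local frame. A minimal sketch, assuming `(K, 2)` landmark arrays in full-frame pixels and `(x1, y1, x2, y2)` boxes; both shapes are our assumptions, not a documented schema:

```python
import numpy as np

# Hedged sketch: we assume landmarks come as a (K, 2) array in full-frame
# pixel coordinates and boxes as (x1, y1, x2, y2); the actual on-disk
# schema may differ.
def to_box_coords(landmarks, box):
    """Shift full-frame pixel landmarks into box-local coordinates."""
    x1, y1, _, _ = box
    return landmarks - np.array([x1, y1], dtype=landmarks.dtype)

# Two landmarks inside a crop whose top-left corner is (100, 50).
local = to_box_coords(np.array([[120.0, 80.0], [150.0, 60.0]]),
                      (100, 50, 200, 150))
# → [[20., 30.], [50., 10.]]
```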
We use youtube-dl to automatically download the selected videos. Make sure you are using an up-to-date version.

```shell
apt-get install youtube-dl
```
Our code is implemented in Python 3; please install the following packages:

```shell
pip3 install Pillow opencv-python tqdm numpy h5py
```
Google drive (237MB)
Download the selected YouTube videos using the following command:

```shell
bash run.sh /path/to/video_folder video_data.csv
```
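For reference, the per-row logic of run.sh can be sketched in Python. The column name `video_id` and the csv layout are assumptions for illustration, not the actual file format:

```python
import csv
import io

# Hypothetical sketch of what run.sh does per csv row: build one
# youtube-dl command per listed video. "video_id" is an assumed column.
def build_commands(csv_text, out_dir):
    cmds = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = "https://www.youtube.com/watch?v=" + row["video_id"]
        cmds.append(["youtube-dl", "-o", out_dir + "/%(id)s.%(ext)s", url])
    return cmds

cmds = build_commands("video_id\nabc123\n", "videos")
# Each command can then be run with subprocess.run(cmd, check=True).
```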
Generate an image-based dataset for re-ID using the following script:

```shell
python3 gen_DanceReID.py -i /path/to/video_folder -n /path/to/npy_folder \
    -a /path/to/annotation_json -d 5 [ -o /path/to/output_folder ] [ -gs ] \
    [ --split-folder ] [ -h5 ]
```
The resulting dataset folder should have the following structure:

```
path/to/your/DanceReID/
|-- images/ ................... (if using the --split-folder flag)
|   |-- video_folders/
|   ...
|-- poses/ .................... (if using the --split-folder flag)
|   |-- video_folders/
|   ...
|-- skeleton/ ................. (if using the -gs flag)
|   |-- video_folders/ ........ (if using the --split-folder flag)
|   ...
|-- splits.json
|-- meta.json
|-- video.json
|-- DanceReID.h5 .............. (generated if using the -h5 flag)
```
Note that if you do not apply the --split-folder flag when generating the data, there will be no separate per-video folders.
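Once generated, the json metadata can be consumed with the standard library. The schema below (split names mapping to person-ID lists) is an assumption for illustration; check the generated splits.json for the actual layout:

```python
import json

# Assumed schema for splits.json: split name -> list of person IDs.
example = {"trainval": [0, 1, 2], "test": [3, 4]}
with open("splits.json", "w") as f:
    json.dump(example, f)

with open("splits.json") as f:
    splits = json.load(f)

trainval_ids = set(splits["trainval"])
test_ids = set(splits["test"])
# re-ID evaluation requires disjoint identity sets across splits.
assert trainval_ids.isdisjoint(test_ids)
```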
In our paper, we downsample the videos by keeping every 5th frame (using the flag -d 5) for evaluation. This results in the following dataset statistics:
subset | # ids | # images | # videos |
---|---|---|---|
trainval | 71 | 31643 | 15 |
test | 29 | 19526 | 6 |
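The -d temporal downsampling mentioned above can be sketched as keeping every d-th frame index:

```python
# Sketch of the -d flag: keep every d-th frame of a video.
def sampled_frames(num_frames, d=5):
    return list(range(0, num_frames, d))

frames = sampled_frames(23, d=5)
# → [0, 5, 10, 15, 20]
```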
More details on each video can be found in the csv file.
Note: in our paper, we use only 100 IDs for the experiments. We also provide an extended version of our dataset (178 IDs in 33 videos total) for further research; you can download all of its videos by replacing this csv file.
Evaluation metrics: mAP and CMC (CUHK03 protocol, single gallery shot)
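A minimal sketch of how such metrics are computed from a query-gallery distance matrix (not the authors' evaluation code; it assumes every query identity appears in the gallery and skips the CUHK03 single-gallery-shot resampling and camera filtering):

```python
import numpy as np

# Hedged sketch of mAP and CMC from a (num_query, num_gallery) distance
# matrix with per-item identity labels.
def evaluate(dist, q_ids, g_ids, max_rank=5):
    num_q = dist.shape[0]
    cmc = np.zeros(max_rank)
    aps = []
    for i in range(num_q):
        order = np.argsort(dist[i])           # gallery sorted by distance
        matches = g_ids[order] == q_ids[i]    # boolean match flags
        hits = np.flatnonzero(matches)        # 0-based ranks of true matches
        # CMC: count queries whose first correct match is within rank k.
        if hits[0] < max_rank:
            cmc[hits[0]:] += 1
        # AP: mean precision at each correct-match position.
        precision = (np.arange(len(hits)) + 1) / (hits + 1)
        aps.append(precision.mean())
    return cmc / num_q, float(np.mean(aps))

# Toy example: 2 queries, 4 gallery items, perfect ranking.
dist = np.array([[0.1, 0.9, 0.5, 0.7],
                 [0.8, 0.2, 0.6, 0.4]])
q_ids = np.array([1, 2])
g_ids = np.array([1, 2, 1, 2])
cmc, mAP = evaluate(dist, q_ids, g_ids)
# → cmc = [1., 1., 1., 1., 1.], mAP = 1.0
```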
Methods | mAP | rank-1 | rank-5 |
---|---|---|---|
Softmax [xiao2016] | 74.4 | 73.1 | 94.7 |
Siamese [chung2017] | 77.5 | 75.9 | 96.9 |
Triplet [hermans2017] | 78.4 | 77.2 | 97.6 |
ST-ReIDNet | 86.1 | 84.9 | 98.7 |
Check our baseline and model implementation in the ST-ReIDNet repository.