
VPSNet for Video Panoptic Segmentation

Official implementation for "Video Panoptic Segmentation" (CVPR 2020 Oral)
[Paper] [Dataset] [Project] [Slides] [Codalab]

Dahun Kim, Sanghyun Woo, Joon-Young Lee, and In So Kweon.

Cityscapes-VPS test set evaluation is now available at this Codalab server.



Image-level baseline (left) / VPSNet result (right)

Disclaimer

This repo is tested under Python 3.7, PyTorch 1.4, CUDA 10.0, and mmcv==0.2.14.

Installation

a. This repo is built on mmdetection commit hash 4357697. Our modifications for the VPSNet implementation are listed here. Please refer to INSTALL.md to install the library. You can use the following commands to create a conda env with the related dependencies.

conda create -n vps python=3.7 -y
conda activate vps
conda install pytorch=1.4 torchvision cudatoolkit=10.0 -c pytorch -y
pip install -r requirements.txt
pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
pip install "git+https://github.com/cocodataset/panopticapi.git"
pip install -v -e . 
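
Before moving on, it can help to sanity-check the environment against the versions in the Disclaimer above (a minimal, optional sketch; it is not part of the official instructions):

# Hypothetical sanity check of the tested versions.
import torch
import mmcv

print(torch.__version__)           # expected: 1.4.x
print(torch.cuda.is_available())   # expected: True with a CUDA 10.0 build
print(mmcv.__version__)            # expected: 0.2.14

import mmdet                       # should import cleanly after `pip install -v -e .`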

b. You also need to install dependencies for the Flownet2 and UPSNet modules.

bash ./init_flownet.sh
bash ./init_upsnet.sh

c. You may also need to download some pretrained weights.

pip install gdown
bash ./download_weights.sh

Dataset

You can download Cityscapes-VPS here. It provides 2,500 frames of panoptic labels that temporally extend the 500 Cityscapes image-panoptic labels, for a total of 3,000 labeled frames. These correspond to the 5th, 10th, 15th, 20th, 25th, and 30th frames of each of the 500 videos, and all instance ids are associated over time.

It not only supports the video panoptic segmentation (VPS) task, but also provides super-set annotations for the video semantic segmentation (VSS) and video instance segmentation (VIS) tasks.

Necessary data for Cityscapes-VPS training, testing, and evaluation are as follows. Please refer to DATASET.md for dataset preparation.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── cityscapes_vps
│   │   ├── panoptic_im_train_city_vps.json
│   │   ├── panoptic_im_val_city_vps.json
│   │   ├── panoptic_im_test_city_vps.json  
│   │   ├── instances_train_city_vps_rle.json (for training)
│   │   ├── instances_val_city_vps_rle.json 
│   │   ├── im_all_info_val_city_vps.json (for inference)
│   │   ├── im_all_info_test_city_vps.json (for inference)
│   │   ├── panoptic_gt_val_city_vps.json (for VPQ eval)
│   │   ├── train 
│   │   │   ├── img
│   │   │   ├── labelmap
│   │   ├── val
│   │   │   ├── img
│   │   │   ├── img_all
│   │   │   ├── panoptic_video
│   │   ├── test
│   │   │   ├── img_all
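
To confirm the layout matches the tree above before training or inference, a small check script can be used (a hedged sketch; the paths are taken from the tree above):

# Verify that the expected Cityscapes-VPS files and folders exist.
import os

root = 'data/cityscapes_vps'
required = [
    'panoptic_im_train_city_vps.json',
    'panoptic_im_val_city_vps.json',
    'panoptic_im_test_city_vps.json',
    'instances_train_city_vps_rle.json',
    'instances_val_city_vps_rle.json',
    'im_all_info_val_city_vps.json',
    'im_all_info_test_city_vps.json',
    'panoptic_gt_val_city_vps.json',
    'train/img', 'train/labelmap',
    'val/img', 'val/img_all', 'val/panoptic_video',
    'test/img_all',
]
for rel in required:
    status = 'OK     ' if os.path.exists(os.path.join(root, rel)) else 'MISSING'
    print(status, os.path.join(root, rel))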

The VIPER dataset is provided by "Playing for Benchmarks" (ICCV 2017). We use the 'day' split of the dataset, converted into the video panoptic segmentation format. The converted dataset is provided upon request.

Evaluation Metric
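
Video panoptic quality (VPQ) is the metric used throughout this repo. The following summary is paraphrased from the paper, so treat the notation as a sketch: VPQ extends image-level panoptic quality (PQ) to video by matching predicted and ground-truth segments as spatio-temporal tubes over a window of k consecutive labeled frames,

VPQ^k = (1/|C|) * sum_c [ sum_{(u,u') in TP_c} IoU(u,u') ] / ( |TP_c| + 0.5*|FP_c| + 0.5*|FN_c| )

where TP_c, FP_c, and FN_c are the tube-level true positives, false positives, and false negatives for class c, and IoU is computed over each whole tube. The final VPQ averages VPQ^k over several window sizes; these windows correspond to the λ in the vpq-λ.txt files written by the evaluation commands below.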

Testing

Our trained models are available for download here. Rename the downloaded checkpoint to latest.pth and run the following commands to test the model on Cityscapes-VPS.

  • FuseTrack model for Video Panoptic Quality (VPQ) on Cityscapes-VPS val set (vpq-λ.txt will be saved.)
python tools/test_vpq.py configs/cityscapes/fusetrack.py \
  work_dirs/cityscapes_vps/fusetrack_vpct/latest.pth \
  --out work_dirs/cityscapes_vps/fusetrack_vpct/val.pkl \
  --pan_im_json_file data/cityscapes_vps/panoptic_im_val_city_vps.json \
  --n_video 50 --mode val
python tools/eval_vpq.py \
  --submit_dir work_dirs/cityscapes_vps/fusetrack_vpct/val_pans_unified/ \
  --truth_dir data/cityscapes_vps/val/panoptic_video/ \
  --pan_gt_json_file data/cityscapes_vps/panoptic_gt_val_city_vps.json
  • FuseTrack model VPS inference on Cityscapes-VPS test set
python tools/test_vpq.py configs/cityscapes/fusetrack.py \
  work_dirs/cityscapes_vps/fusetrack_vpct/latest.pth \
  --out work_dirs/cityscapes_vps/fusetrack_vpct/test.pkl \
  --pan_im_json_file data/cityscapes_vps/panoptic_im_test_city_vps.json \
  --n_video 50 --mode test

The predicted results will be written as pred.json and pan_pred/*.png under work_dirs/cityscapes_vps/fusetrack_vpct/test_pans_unified/.

The Cityscapes-VPS test split currently allows evaluation only on the Codalab server. Please upload submission.zip to the Codalab server to see the actual performance.

submission.zip
├── pred.json
├── pan_pred.zip
│   ├── 0005_0025_frankfurt_000000_001736.png
│   ├── 0005_0026_frankfurt_000000_001741.png
│   ├── ...
│   ├── 0500_3000_munster_000173_000029.png
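
One way to package the outputs into the layout above is with Python's standard zipfile module (a hedged sketch; the directory names follow the test command above, and the exact layout expected by the server should be double-checked against the Codalab page):

# Bundle pred.json and the predicted PNGs into submission.zip.
import os
import zipfile

out_dir = 'work_dirs/cityscapes_vps/fusetrack_vpct/test_pans_unified'
pan_dir = os.path.join(out_dir, 'pan_pred')

# pan_pred.zip holds the PNGs at its top level, as in the tree above.
with zipfile.ZipFile(os.path.join(out_dir, 'pan_pred.zip'), 'w') as zf:
    for name in sorted(os.listdir(pan_dir)):
        zf.write(os.path.join(pan_dir, name), arcname=name)

with zipfile.ZipFile('submission.zip', 'w') as zf:
    zf.write(os.path.join(out_dir, 'pred.json'), arcname='pred.json')
    zf.write(os.path.join(out_dir, 'pan_pred.zip'), arcname='pan_pred.zip')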

Training

  • Train the FuseTrack model on video-level Cityscapes-VPS. We start from the initial weights of an image panoptic segmentation (IPS) model pretrained on the original Cityscapes. Place the checkpoint at work_dirs/cityscapes/fuse_vpct/, rename it to latest.pth, and run the following command.
# Multi-GPU distributed training
bash ./tools/dist_train.sh configs/cityscapes/fusetrack.py ${GPU_NUM}
# OR
python ./tools/train.py configs/cityscapes/fusetrack.py --gpus ${GPU_NUM}

Citation

If you use this toolbox or benchmark in your research, please cite this project.

@inproceedings{kim2020vps,
  title={Video Panoptic Segmentation},
  author={Dahun Kim and Sanghyun Woo and Joon-Young Lee and In So Kweon},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Terms of Use

This software is for non-commercial use only. The source code is released under the Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license (see this for details).

Acknowledgements

This project has used utility functions from other wonderful open-source libraries. We would especially like to thank the authors of mmdetection, UPSNet, and FlowNet2.

Contact

If you have any questions regarding the repo, please contact Dahun Kim ([email protected]) or create an issue.

Contributors

joe-siyuan-qiao, joonyoung-cv, mcahny


vps's Issues

missing libtorch_cpu.so

Does anyone have this issue?

File "/home/user/4TB/tmp/vps/mmdet/ops/nms/nms_wrapper.py", line 4, in <module>
    from . import nms_cpu, nms_cuda
ImportError: libtorch_cpu.so: cannot open shared object file: No such file or directory

Thanks,

Training error on cityscapes-vps

Hello,

I've encountered a bug when trying to train the network and am wondering if anyone else has encountered a similar issue or has some insight.

So I run this command
python ./tools/train.py configs/cityscapes/fusetrack.py --gpus 4
(I also get the same error with 3 gpus)

I've run the command multiple times, and I get the following error at different iterations (different training samples) each time:
Traceback (most recent call last):
  File "./tools/train.py", line 108, in <module>
    main()
  File "./tools/train.py", line 104, in main
    logger=logger)
  File "/vps/mmdet/apis/train.py", line 90, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/vps/mmdet/apis/train.py", line 257, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/runner.py", line 358, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/mmcv/runner/runner.py", line 264, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/vps/mmdet/apis/train.py", line 68, in batch_processor
    losses = model(**data)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.gather(outputs, self.output_device)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 60, in gather_map
    raise ValueError('All dicts must have the same number of keys')

This is the last line before the error propagates into the pytorch code.
File "/vps/mmdet/apis/train.py", line 68, in batch_processor losses = model(**data)

I've already tried printing the number of keys of all dictionaries inside variable data but the number of keys doesn't vary across iterations. I've also tried rerunning the dataset preparation scripts.

Does anyone have any insight? Thanks!

mmdet-version question

Hi,

According to INSTALL.md, the mmdet version seems to be the newest, which requires mmcv>=0.6.0. However, the requirements.txt file specifies mmcv==0.2.14.
Therefore, I am confused about the correct version of mmdet.
Looking forward to your kind reply. Thanks!

VIPER dataset

Thank you for the nice work!
I tried to download the Cityscapes-VPS dataset and the VIPER dataset, but the download link on the website does not include the VIPER dataset.
Do you have any plans to release the VIPER dataset?

Pretrained Model Setting

Hi! This is amazing work! Could you share the training scripts for the baseline image model (UPSNet) on Cityscapes?

cannot prepare data

Hi,
I tried to prepare the data as described in DATASET.md. After running the fetch command I get the error below.
My command:
python prepare_data/fetch_city_images.py --src_dir /home/nazib/Data/leftImg8bit/val/ --dst_dir /home/nazib/Data/cityscapes_vps/ --mode SPLIT

output:
FileNotFoundError: [Errno 2] No such file or directory: '/home/nazib/Data/cityscapes_vps/panoptic_im_SPLIT_city_vps.json'

I couldn't find that json file after unzipping the cityscapes_vps data

Please transfer the old competition to the new CodaLab website

Hi, @mcahny, @joonyoung-cv and @joe-siyuan-qiao .
Thanks for your work on the Cityscapes VPS dataset.

However, the old CodaLab website (https://competitions.codalab.org/competitions/26183) no longer accepts result submissions for evaluation, so I would like to ask whether you have time to transfer the original competition to the new website in the near future. This is very important for our future research. (For the migration method, see: https://groups.google.com/g/codalab-competitions/c/xJlHTZAgztM)

How to train and test on single-image panoptic segmentation?

Thank you for sharing your great project.
I hope to use the project to train and test on my own single-image dataset for panoptic segmentation.
Could you please provide some suggestions on how to convert your project into a single-image-based framework?

Thanks
Zhiwen

Cannot reproduce your results

Hi! Thanks for open-sourcing your code. However, I cannot reproduce your FuseTrack results on the Cityscapes-VPS validation set.
I start from this checkpoint: ./work_dirs/cityscapes/fuse_vpct/cityscapes_fuse_latest.pth
I only obtain a vpq-final score of 55.6.

How can I run test_vpq.py and eval_vpq.py? I can't find the datasets leftImg8bit_sequence.zip and gtFine.zip

I can't find the datasets leftImg8bit_sequence.zip and gtFine.zip in the Cityscapes downloads. When I run the test program I get an error like this:

/home/eagle/work/vps-master/tools/config/config.py:180: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  exp_config = edict(yaml.load(f))
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
==> Inside PanopticFuseTrack (Device ID: 0)
--- Load flow module: /home/eagle/work/vps-master/work_dirs/flownet/FlowNet2_checkpoint.pth.tar
[ ] 0/1500, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/test_vpq.py", line 209, in <module>
    main()
  File "tools/test_vpq.py", line 158, in main
    data_loader)
  File "tools/test_vpq.py", line 42, in single_gpu_test
    for i, data in enumerate(data_loader):
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/eagle/work/vps-master/mmdet/datasets/custom.py", line 140, in __getitem__
    return self.prepare_test_img(idx)
  File "/home/eagle/work/vps-master/mmdet/datasets/cityscapes_vps.py", line 148, in prepare_test_img
    return self.pipeline(results)
  File "/home/eagle/work/vps-master/mmdet/datasets/pipelines/compose.py", line 22, in __call__
    data = t(data)
  File "/home/eagle/work/vps-master/mmdet/datasets/pipelines/loading.py", line 49, in __call__
    img = mmcv.imread(filename)
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/mmcv/image/io.py", line 41, in imread
    'img file does not exist: {}'.format(img_or_path))
  File "/home/eagle/anaconda3/envs/vps/lib/python3.7/site-packages/mmcv/utils/path.py", line 32, in check_file_exist
    raise FileNotFoundError(msg_tmpl.format(filename))
FileNotFoundError: img file does not exist: data/cityscapes_vps/val/img_all/frankfurt_000000_001732_leftImg8bit.png

question about evaluation code

In eval_vpq.py, at Step 2, shouldn't the initial assignments of gt_segms/pred_segms be done with a deep copy? That is:

# Step 2. Concatenate the collected items -> tube-level.
vid_pan_gt = np.stack(vid_pan_gt)      # [nf,H,W], nf == nframe
vid_pan_pred = np.stack(vid_pan_pred)  # [nf,H,W]
vid_gt_segms, vid_pred_segms = {}, {}  # from json
for gt_segms, pred_segms in zip(gt_segms_list, pred_segms_list):
    # aggregate into tube 'area'
    for k in gt_segms.keys():
        if not k in vid_gt_segms:
            # vid_gt_segms[k] = gt_segms[k]
            vid_gt_segms[k] = copy.deepcopy(gt_segms[k])
        else:
            vid_gt_segms[k]['area'] += gt_segms[k]['area']
    for k in pred_segms.keys():
        if not k in vid_pred_segms:
            # vid_pred_segms[k] = pred_segms[k]
            vid_pred_segms[k] = copy.deepcopy(pred_segms[k])
        else:
            vid_pred_segms[k]['area'] += pred_segms[k]['area']

Otherwise, the vid_gt_segms[k]['area'] / vid_pred_segms[k]['area'] values accumulate into the per-frame dicts across multiple frames.
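
For reference, a toy example of the aliasing described above (hypothetical dicts, not taken from eval_vpq.py):

# Shallow assignment aliases the per-frame dict, so tube accumulation mutates it.
import copy

frame_segms = {7: {'area': 10}}
vid_segms = {7: frame_segms[7]}        # shallow reference
vid_segms[7]['area'] += 5
print(frame_segms[7]['area'])          # 15 -- the per-frame entry changed too

# With a deep copy, the tube accumulates without touching the per-frame dict.
frame_segms = {7: {'area': 10}}
vid_segms = {7: copy.deepcopy(frame_segms[7])}
vid_segms[7]['area'] += 5
print(frame_segms[7]['area'])          # 10 -- unchanged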

about ‘leftImg8bit_sequence’ data

Hi,

Really, thanks for your great work! I am very interested in the topic of VPS, and your work has inspired me a lot.
I came across a problem when preparing the data: the leftImg8bit_sequence data is 325 GB, which is really difficult for me to download. I noticed that for VPS we only need the 'val' split of the data; would you mind sharing the downloaded 'val' split? Thanks a lot!

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repo branches:

Library            OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
MMEngine           -                       0.x
MMCV               1.x                     2.x
MMDetection        0.x, 1.x, 2.x           3.x
MMAction2          0.x                     1.x
MMClassification   0.x                     1.x
MMSegmentation     0.x                     1.x
MMDetection3D      0.x                     1.x
MMEditing          0.x                     1.x
MMPose             0.x                     1.x
MMDeploy           0.x                     1.x
MMTracking         0.x                     1.x
MMOCR              0.x                     1.x
MMRazor            0.x                     1.x
MMSelfSup          0.x                     1.x
MMRotate           1.x                     1.x
MMYOLO             -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Cityscapes-VPS images

Where can I download the original images of Cityscapes-VPS?
I found the annotated GT images from the link on GitHub, but I cannot find the original sequential images, not even on the Cityscapes website.

Training and testing at different image resolutions

Hello!
Thank you for sharing this project with us; it's a very impressive piece of work.
I am running into some issues, though: I want to train VPS on the Cityscapes-VPS dataset at various smaller resolutions. In the fusetrack.py config file, in the train/test pipeline, I am able to change the resize ratios for the input images, but I am unsure whether this also resizes the ground truths to match the inputs. If not, how can I resize both inputs and ground truths to the same size?
Thank you!
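
For what it's worth, a hedged sketch of how a smaller training resolution is usually configured in mmdetection-style pipelines (the transform names follow mmdetection conventions; the actual keys in fusetrack.py may differ):

# In mmdetection-style pipelines, Resize rescales the image together with the
# loaded ground truths (boxes, masks, semantic maps), so they stay aligned.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
    dict(type='Resize', img_scale=(1024, 512), keep_ratio=True),  # (W, H); assumed value
    dict(type='RandomFlip', flip_ratio=0.5),
    # ... normalization, padding, and formatting transforms as in fusetrack.py
]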

'PanopticFuseTrack' object has no attribute 'prev_bboxes'

python tools/test_vpq.py configs/cityscapes/fusetrack.py work_dirs/cityscapes_vps/fusetrack_vpct/latest.pth --out work_dirs/cityscapes_vps/fusetrack_vpct/test.pkl --pan_im_json_file data/cityscapes_vps/panoptic_im_val_city_vps.json --n_video 50 --mode test
/opt/sherry/lane_decetion/vps/tools/config/config.py:180: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
exp_config = edict(yaml.load(f))
loading annotations into memory...
Done (t=1.01s)
creating index...
index created!
==> Inside PanopticFuseTrack (Device ID: 0)
--- Load flow module: /opt/sherry/lane_decetion/vps/work_dirs/flownet/FlowNet2_checkpoint.pth.tar
[ ] 0/500, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/test_vpq.py", line 203, in <module>
    main()
  File "tools/test_vpq.py", line 153, in main
    data_loader)
  File "tools/test_vpq.py", line 46, in single_gpu_test
    result = model(return_loss=False, rescale=not show, **data)
  File "/opt/sherry/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/sherry/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/opt/sherry/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/sherry/lane_decetion/vps/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/opt/sherry/lane_decetion/vps/mmdet/models/detectors/base.py", line 104, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/opt/sherry/lane_decetion/vps/mmdet/models/detectors/base.py", line 95, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/opt/sherry/lane_decetion/vps/mmdet/models/detectors/panoptic_fusetrack.py", line 543, in simple_test
    rescale=rescale, is_panoptic=True)
  File "/opt/sherry/lane_decetion/vps/mmdet/models/detectors/panoptic_fusetrack.py", line 400, in simple_test_bboxes
    if is_first or (not is_first and self.prev_bboxes is None):
  File "/opt/sherry/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'PanopticFuseTrack' object has no attribute 'prev_bboxes'

How can I solve this?

My environment:
gcc version 5.4.0 (GCC)
CUDA version: 10.1
torch.__version__: '1.4.0'

Demo script for inference on your own video

I wanted to run this against the video at https://archive.org/details/0002201705192 and share the results on my YT channel.
After an hour of fighting with the code, I know that I will not be able to finish this.

Do you plan to release a tools/demo.py script that would allow running the model on a new video/camera without GT, to get results at the original resolution?

BTW, I had only one problem with the installation, which is nice!
I needed to do this before running tools/test_vpq.py:

conda install -c https://conda.binstar.org/auto easydict

How to convert Cityscapes-VPS into Cityscapes format?

Hello, thank you for providing such great work.

I am trying some codebases that are typically used with the Cityscapes dataset; is there an easy way to create a Cityscapes-like panoptic dataset? Thank you.

I think it would be more helpful if you provided your dataset in the standard COCO panoptic segmentation format.

Confusion about epoch setting

Hi mcahny,
I found that the training-epoch settings in the code differ from the settings in the paper.
In the paper, 12 epochs are set for VIPER and 144 epochs for Cityscapes-VPS, as shown in the screenshot below.

[screenshot: epoch settings from the paper]

But the code you provided sets 12 epochs for the Cityscapes-VPS dataset.

[screenshot: epoch setting in the code]

Is the provided config only a fine-tuning stage?
