DuPL

This repository contains the source code of CVPR 2024 paper: "DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation".

📢 Update Log

  • Mar. 21, 2024 (U2): Add the evaluation / visualization scripts for CAM and segmentation inference.
  • Mar. 21, 2024 (U1): The pre-trained checkpoints and segmentation results are released 🔥🔥🔥.
  • Mar. 17, 2024: Basic training code released.

Get Started

Training Environment

The implementation is based on PyTorch 1.13.1 and uses single-node multi-GPU training. Please install the required packages by running:

pip install -r requirements.txt

Datasets

VOC dataset

1. Download from the official website

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_11-May-2012.tar

2. Download the augmented annotations

The augmented annotations come from the SBD dataset and are available through a download link on Dropbox. After downloading SegmentationClassAug.zip, unzip it and move it to VOCdevkit/VOC2012. The directory should then look like this:

VOCdevkit/
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject

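Before starting a long training run, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the released code: the helper name `missing_subdirs` and the constant `VOC_SUBDIRS` are hypothetical, and the expected folder names are simply copied from the tree above.

```python
from pathlib import Path

# Sub-directories copied from the VOC2012 tree shown above.
VOC_SUBDIRS = [
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationClassAug",
    "SegmentationObject",
]


def missing_subdirs(root, expected=VOC_SUBDIRS):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

Calling `missing_subdirs("VOCdevkit/VOC2012")` returns an empty list when every folder is present; otherwise it lists the folders that still need to be downloaded or moved.
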
COCO dataset

1. Download

wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip

2. Generate VOC-style segmentation labels for COCO

The training pipeline uses VOC-style segmentation labels for the COCO dataset. Please download the converted masks from OneDrive. The directory should be:

MSCOCO/
├── coco2014
│    ├── train2014
│    └── val2014
└── SegmentationClass (extract the downloaded "coco_mask.tar.gz")
     ├── train2014
     └── val2014

NOTE: You can also use the scripts provided at this repo to convert the COCO segmentation masks.

Experiments

Training DuPL

To train the segmentation model for the VOC dataset, please run

# For Pascal VOC 2012
python -m torch.distributed.run --nproc_per_node=2 train_final_voc.py --data_folder [../VOC2012]

NOTE:

  • The valid batch size should be 4 (num_gpus * sampler_per_gpu).
  • The --nproc_per_node should be set according to your environment (recommended: 2x NVIDIA RTX 3090 GPUs).

To train the segmentation model for the MS COCO dataset, please run

# For MSCOCO
python -m torch.distributed.run --nproc_per_node=4 train_final_coco.py --data_folder [../MSCOCO/coco2014]

NOTE:

  • The valid batch size should be 8 (num_gpus * sampler_per_gpu).
  • The --nproc_per_node should be set according to your environment (recommended: 4x NVIDIA RTX 3090 GPUs).
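The batch-size notes above boil down to simple arithmetic: the valid batch size is the number of GPUs times the samples per GPU. The sketch below only illustrates that relation for a different GPU count. The function name `samples_per_gpu` is hypothetical, and how the per-GPU value is actually passed to the training scripts is not covered here.

```python
def samples_per_gpu(valid_batch_size, num_gpus):
    """Split the valid (total) batch size evenly across the available GPUs."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if valid_batch_size % num_gpus != 0:
        raise ValueError("valid_batch_size must be divisible by num_gpus")
    return valid_batch_size // num_gpus


print(samples_per_gpu(4, 2))  # VOC on 2 GPUs  -> 2
print(samples_per_gpu(8, 4))  # COCO on 4 GPUs -> 2
print(samples_per_gpu(4, 1))  # VOC on 1 GPU   -> 4
```

If the GPU count does not divide the valid batch size, the function raises an error instead of silently changing the total.
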

Evaluation

Please install pydensecrf first:

pip install git+https://github.com/lucasb-eyer/pydensecrf.git

NOTE: using pip install pydensecrf will install an incompatible version ⚠️.

To evaluate the trained model, please run:

# For Pascal VOC
python eval_seg_voc.py --data_folder [../VOC2012] --model_path [path_to_model]

# For MSCOCO
python -m torch.distributed.launch --nproc_per_node=4 eval_seg_coco_ddp.py --data_folder [../MSCOCO/coco2014] --label_folder [../MSCOCO/SegmentationClass] --model_path [path_to_model]

NOTE:

  • The segmentation results will be saved in the checkpoint directory.
  • DuPL has two independent models (students), and we select the better one for evaluation. Tricks such as ensembling or model soups might further improve the performance.
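As an illustration of the ensembling idea mentioned in the note above, the sketch below averages the per-pixel class probabilities of two students and takes the argmax. This is an assumption about one possible ensemble, not something the released scripts do, and it is written in plain Python on nested lists for readability; `ensemble_argmax` is a hypothetical name.

```python
def ensemble_argmax(probs_a, probs_b):
    """Average two students' per-pixel class probabilities and return one label per pixel.

    probs_a, probs_b: lists of pixels, each pixel a list of class probabilities.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both students must predict the same number of pixels")
    labels = []
    for pa, pb in zip(probs_a, probs_b):
        avg = [(x + y) / 2.0 for x, y in zip(pa, pb)]
        # Index of the largest averaged probability (first one wins on ties).
        labels.append(max(range(len(avg)), key=avg.__getitem__))
    return labels


student_a = [[0.6, 0.4], [0.3, 0.7]]  # alone, predicts labels [0, 1]
student_b = [[0.2, 0.8], [0.4, 0.6]]  # alone, predicts labels [1, 1]
print(ensemble_argmax(student_a, student_b))  # [1, 1]
```

Whether such an ensemble actually helps DuPL would need to be checked on the validation set.
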

Convert RGB segmentation labels for the official VOC evaluation:

# modify the "dir" and "target_dir" before running
python convert_voc_rgb.py
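For context, PASCAL VOC label images use a fixed colour palette in which each class index maps to one RGB colour. The sketch below generates that standard palette and an inverse lookup. It is only an illustration of the colour/index relationship; it makes no claim about how convert_voc_rgb.py is implemented, and the names `voc_colormap`, `PALETTE` and `RGB_TO_INDEX` are hypothetical.

```python
def voc_colormap(n=256):
    """Build the standard PASCAL VOC palette as a list of (R, G, B) tuples."""
    cmap = []
    for index in range(n):
        r = g = b = 0
        c = index
        # Spread the bits of the class index over the high bits of R, G and B.
        for shift in range(7, -1, -1):
            r |= (c & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        cmap.append((r, g, b))
    return cmap


PALETTE = voc_colormap()
# Inverse lookup: RGB triple -> class index (all 256 palette colours are distinct).
RGB_TO_INDEX = {rgb: idx for idx, rgb in enumerate(PALETTE)}

print(PALETTE[0])                     # background: (0, 0, 0)
print(PALETTE[1])                     # aeroplane:  (128, 0, 0)
print(PALETTE[15])                    # person:     (192, 128, 128)
print(RGB_TO_INDEX[(224, 224, 192)])  # void/boundary label: 255
```
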

CAM inference & evaluation:

python infer_cam_voc.py --data_folder [../VOC2012] --model_path [path_to_model]

NOTE: The CAM results will be saved in the checkpoint directory.

TIPS:

  • The evaluation on MS COCO uses DDP to accelerate the evaluation stage. Please make sure that torch.distributed.launch is available in your environment.
  • We highly recommend using a high-performance CPU for CRF post-processing, which is quite time-consuming. On MS COCO, CRF post-processing may take several hours.

Results

Checkpoints

We provide DuPL's pre-trained checkpoints on the VOC and COCO datasets. With these checkpoints, you should be able to reproduce the exact performance listed below.

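Dataset    | val  | Log | Weights | val (with MS+CRF) | test (with MS+CRF)
-----------|------|-----|---------|-------------------|-------------------
VOC        | 69.9 | log | weights | 72.2              | 71.6
VOC (21k)  | --   | log | weights | 73.3              | 72.8
COCO       | --   | log | weights | 43.5              | --
COCO (21k) | --   | log | weights | 44.6              | --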

The VOC test results are evaluated on the official server, and the result links are provided in the paper.
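For readers comparing their own numbers against the table, the sketch below shows how mean IoU is commonly computed from predicted and ground-truth labels. It assumes the scores above are mean IoU percentages (the usual metric for semantic segmentation) and that label 255 marks ignored pixels, as in VOC. It is not the repository's evaluation code, and `mean_iou` is a hypothetical name.

```python
def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Compute mean IoU over classes from two flat sequences of integer labels."""
    # confusion[g][p] counts pixels with ground truth g that were predicted as p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, g in zip(pred, gt):
        if g == ignore_index:
            continue  # ignored pixels do not contribute to the score
        confusion[g][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))  # 0.5
```

Skipping classes that never appear is one common convention; an evaluation server may handle such classes differently.
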

Visualization

We provide visualizations of the CAMs and segmentation results (RGB) on VOC 2012 (val and test) and MS COCO at the following links. We hope they make it easier to compare with other works :)

Dataset           | Link              | Model
------------------|-------------------|----------------------
VOC - Validation  | dupl_voc_val.zip  | DuPL (VOC test: 71.6)
VOC - Test        | dupl_voc_test.zip | DuPL (VOC test: 71.6)
COCO - Validation | dupl_coco_val.zip | DuPL (COCO val: 43.5)

Citation

Please kindly cite our paper if you find it helpful in your work:

@inproceedings{wu2024dupl,
  title={DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation},
  author={Wu, Yuanchen and Ye, Xichen and Yang, Kequan and Li, Jide and Li, Xiaoqiang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3534--3543},
  year={2024}
}

Acknowledgement

We would like to thank all the researchers who open-sourced their work and made this project possible, and especially the authors of ToCo for their brilliant work.

dupl's Issues

Questions about discrepancy loss

Thank you very much for your previous answer. I found that the discrepancy loss in the code is different from the one in your paper. Which one is correct?

custom dataset question

Hi, dear author. I want to train on a custom dataset, but my dataset does not have img_box, while your code is written for the VOC and COCO datasets and uses img_box. How should I change the code?

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one.

Thanks for the awesome work! It is really interesting and powerful!
But when I tried to reproduce this work by training on the VOC dataset, the following error occurred: "RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. Since find_unused_parameters=True is enabled, this likely means that not all forward outputs participate in computing loss. You can fix this by making sure all forward function outputs participate in calculating loss." (By the way, I only have a single GPU.)
Could you please give me some advice? Thanks for your help!

Reproduction of results

Hello, author. I followed the instructions, but the final segmentation result was only 64.2% mIoU on VOC val, and the CAM result was only 67.1% mIoU on VOC val.
My train.log is attached.
Can you provide instructions for all the training steps?
