licongguan / ilm-assl

Iterative Loop Method Combining Active and Semi-Supervised Learning for Domain Adaptive Semantic Segmentation

Python 97.86% Shell 2.14%

cityscapes computer-vision data-efficient-learning deep-learning domain-adaptation gtav mechine-learing pytorch semantic-segmentation semi-supervised-learning

ilm-assl's Introduction

Iterative Loop Method Combining Active and Semi-Supervised Learning for Domain Adaptive Semantic Segmentation

by Licong Guan, Xue Yuan

This repository provides the official code for the paper Iterative Loop Method Combining Active and Semi-Supervised Learning for Domain Adaptive Semantic Segmentation.

Abstract: Semantic segmentation is an important technique for environment perception in intelligent transportation systems. With the rapid development of convolutional neural networks (CNNs), road scene analysis can usually achieve satisfactory results in the source domain. However, guaranteeing good generalization to different target domain scenarios remains a significant challenge. Recently, semi-supervised learning and active learning have been proposed to alleviate this problem. Semi-supervised learning can improve model accuracy with massive unlabeled data, but noisy pseudo-labels are generated when training data are limited or imbalanced, and the resulting models are suboptimal without human guidance. Active learning can select more effective data for human intervention, but model accuracy cannot improve because the massive unlabeled data go unused; when the domain gap is too large, the probability of querying suboptimal samples also rises, increasing annotation cost. This paper proposes an iterative loop method combining active and semi-supervised learning for domain adaptive semantic segmentation. The method first uses semi-supervised learning on massive unlabeled data to improve model accuracy and provide a more accurate selection model for active learning. Second, combined with the predictive-uncertainty sample selection strategy of active learning, manual intervention is used to correct the pseudo-labels. Finally, flexible iterative loops achieve the best performance with minimal labeling cost. Extensive experiments show that our method establishes state-of-the-art performance on the GTAV→Cityscapes and SYNTHIA→Cityscapes tasks, improving on the previous best method by 4.9% mIoU and 5.2% mIoU, respectively.
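
The abstract mentions a predictive-uncertainty sample selection strategy. As a rough illustration only, the sketch below ranks images by the mean entropy of their per-pixel class probabilities and picks the most uncertain ones for annotation. Entropy is one common uncertainty measure and is an assumption here, not necessarily the criterion used in the paper; the function names and toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_entropy(pixel_probs: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over the per-pixel class distributions of one image."""
    total = 0.0
    for probs in pixel_probs:
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(pixel_probs)


def select_for_annotation(
    predictions: Dict[str, Sequence[Sequence[float]]], budget: int
) -> List[str]:
    """Return the `budget` image ids whose predictions are most uncertain."""
    ranked = sorted(
        predictions,
        key=lambda image_id: mean_entropy(predictions[image_id]),
        reverse=True,
    )
    return ranked[:budget]


# Toy softmax outputs (hypothetical): two pixels per image, three classes per pixel.
toy_predictions = {
    "confident": [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]],
    "uncertain": [[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]],
    "mixed": [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33]],
}
selected = select_for_annotation(toy_predictions, budget=2)
print(selected)
```

In an iterative loop of the kind the abstract describes, the selected images would be labeled by a human and moved from the unlabeled pool to the labeled pool before the next round of semi-supervised training.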

For more information on ILM-ASSL, please check our [Paper].

Usage

Prerequisites

Python 3.6.9
PyTorch 1.8.1
torchvision 0.9.1

Step-by-step installation

git clone https://github.com/licongguan/ILM-ASSL.git && cd ILM-ASSL
conda create -n ILM-ASSL python=3.6.9
conda activate ILM-ASSL
pip install -r requirements.txt
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html

Data Preparation

Download the Cityscapes dataset, the GTAV dataset, and the SYNTHIA dataset.

First, the data folder should be structured as follows:

├── datasets/
│   ├── cityscapes/
│   │   ├── gtFine/
│   │   └── leftImg8bit/
│   ├── gtav/
│   │   ├── images/
│   │   └── labels/
│   └── synthia/
│       ├── RGB/
│       └── LABELS/
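
Before moving on, it can help to confirm the layout. The following is a minimal sketch, not part of the repository, that reports which of the folders listed above are missing. It assumes the script is run from the directory that contains datasets/; the helper name missing_dirs is hypothetical.

```python
from pathlib import Path
from typing import List

# Sub-folders listed in the tree above, relative to the datasets root.
EXPECTED_DIRS = [
    "cityscapes/gtFine",
    "cityscapes/leftImg8bit",
    "gtav/images",
    "gtav/labels",
    "synthia/RGB",
    "synthia/LABELS",
]


def missing_dirs(root: str) -> List[str]:
    """Return the expected sub-folders that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (base / rel).is_dir()]


if __name__ == "__main__":
    # "datasets" is an assumed location; pass your own path if it differs.
    missing = missing_dirs("datasets")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```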

Second, generate the _labelTrainIds.png files for the Cityscapes dataset:

pip install cityscapesscripts
pip install cityscapesscripts[gui]
export CITYSCAPES_DATASET='/path_to_cityscapes'
csCreateTrainIdLabelImgs
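
To check that the conversion produced output, a small sketch like the one below counts the generated *_labelTrainIds.png files per split. It is an illustration rather than repository code: the helper name is hypothetical, and the fallback path datasets/cityscapes is an assumption used only when CITYSCAPES_DATASET is not set.

```python
import os
from pathlib import Path
from typing import Dict


def count_train_id_labels(cityscapes_root: str) -> Dict[str, int]:
    """Count *_labelTrainIds.png files in each split folder under gtFine/."""
    gt_fine = Path(cityscapes_root) / "gtFine"
    counts = {}
    for split_dir in sorted(p for p in gt_fine.iterdir() if p.is_dir()):
        counts[split_dir.name] = sum(1 for _ in split_dir.rglob("*_labelTrainIds.png"))
    return counts


if __name__ == "__main__":
    # Reuse the variable exported above; the fallback path is an assumption.
    root = os.environ.get("CITYSCAPES_DATASET", "datasets/cityscapes")
    if (Path(root) / "gtFine").is_dir():
        for split, n in count_train_id_labels(root).items():
            print(f"{split}: {n} label files")
    else:
        print(f"No gtFine folder found under {root}")
```

A split that reports zero files suggests the conversion did not run for that split or that CITYSCAPES_DATASET points at the wrong folder.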

Finally, rename the GTAV and SYNTHIA files by running:

python ILM-ASSL/datasets/rename.py

Prepare Pretrained Backbone

Before training, please download ResNet101 pretrained on ImageNet-1K from one of the following:

Google Drive
Baidu Drive Fetch Code: 0305

After that, set model_urls in ILM-ASSL/models/resnet.py to </path/to/resnet101.pth>.

Model Zoo

GTAV to Cityscapes

Our model checkpoints are available here: [Google Drive] [Baidu Drive] (fetch code: 0305).

Method	Net	Budget	mIoU	Checkpoint	Where in Our Paper
ILM-ASSL	V3+	1%	70.0	Google Drive/BaiDu	Table 1
ILM-ASSL	V3+	2.2%	75.0	Google Drive/BaiDu	Table 1
ILM-ASSL	V3+	5.0%	76.1	Google Drive/BaiDu	Table 1

SYNTHIA to Cityscapes

Method	Net	Budget	mIoU	Checkpoint	Where in Our Paper
ILM-ASSL	V3+	1%	73.2	Google Drive/BaiDu	Table 2
ILM-ASSL	V3+	2.2%	76.0	Google Drive/BaiDu	Table 2
ILM-ASSL	V3+	5.0%	76.6	Google Drive/BaiDu	Table 2
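
The mIoU column in the tables above is the mean intersection-over-union. For readers unfamiliar with the metric, here is a minimal sketch of how it is commonly computed from a confusion matrix. This is not the repository's evaluation code; the function names are hypothetical, and treating label 255 as an ignored pixel follows the usual Cityscapes train-ID convention, which is an assumption about this setup.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,
) -> List[List[int]]:
    """Rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t != ignore_index:  # skip pixels without a valid label
            matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average IoU over classes that appear in the ground truth or the prediction."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(row[c] for row in matrix) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes, one ignored pixel (label 255).
labels = [0, 0, 1, 1, 2, 255]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, preds, num_classes=3)
print(f"mIoU = {mean_iou(cm):.3f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.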

ILM-ASSL Training

We provide training scripts for multi-GPU training.

# Training for GTAV to Cityscapes
# Uses 2000 labeled GTAV images and 30 (1%) labeled Cityscapes images
cd experiments/gtav2cityscapes/1.0%
# Launched via torch.distributed.launch
sh train.sh <num_gpu> <port>
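
The "30 (1%)" in the comment above can be checked with simple arithmetic. The sketch below assumes the budget is measured against the standard Cityscapes training split of 2,975 images; the helper name is hypothetical.

```python
# Size of the standard Cityscapes training split (assumed to be the budget base).
CITYSCAPES_TRAIN_IMAGES = 2975


def budget_percent(num_labeled: int, total: int = CITYSCAPES_TRAIN_IMAGES) -> float:
    """Share of the target training set that is annotated, in percent."""
    return 100.0 * num_labeled / total


print(f"{budget_percent(30):.2f}%")
```

Under the same assumption, the 5% log quoted in the issues reports 2825 unlabeled samples, which would leave 2975 - 2825 = 150 labeled Cityscapes images, or about 5.04%.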

ILM-ASSL Testing

sh eval.sh

Acknowledgement

This project is based on the following open-source projects: U²PL and RIPU. We thank their authors for making the source code publicly available.

Citation

If you find this project useful in your research, please consider citing:

@article{guan2023iterative,
  title={Iterative Loop Method Combining Active and Semi-Supervised Learning for Domain Adaptive Semantic Segmentation},
  author={Guan, Licong and Yuan, Xue},
  journal={arXiv preprint arXiv:2301.13361},
  year={2023}
}

Contact

If you have any problems with our code, feel free to contact

[email protected]

or describe your problem in Issues.

ilm-assl's Issues

active learning

Hello! Thank you for your excellent open-source work. Where does the active learning code run? Or do I need to run it manually myself?

About the log

Hi, thank you for sharing this nice work!
I am wondering if there is a training log available, since I want to check whether I'm training correctly to reproduce the paper's results.

I'm working with the GTA2Cityscapes 1% protocol to reproduce the result. So far I'm at epoch 50 and the mIoU is around 43, which is far behind the paper's result (70).

Thanks in advance!
Joo Young Jang

Hello, I want to know how to resolve "Could not find a version that satisfies the requirement cityscpaesscripts (from versions: none)" when I run pip install cityscpaesscripts in a Windows environment.

About pretrained weights in 2.2%, 5%

Hi, I'm interested in the self-training + active learning concept and want to reproduce the results reported in the paper.

However, looking at the 2.2% and 5% logs that you sent, I am confused about the iterative loop.

As far as I understand, 2.2% and 5% are the 2nd iteration, so the pretrained weights should come from the 1% model's final output.
However, the log says that the pretrained model is from ImageNet.

If I'm wrong, please let me know.

Sincerely, Joo Young Jang

The following is the start of the 5% log.

set random seed to 1
[Info] Load ImageNet pretrain from '/media/dell/Elements/DATA/core/models/resnet101.pth'
missing_keys: []
unexpected_keys: ['fc.weight', 'fc.bias']
[Info] Load ImageNet pretrain from '/media/dell/Elements/DATA/core/models/resnet101.pth'
missing_keys: []
unexpected_keys: ['fc.weight', 'fc.bias']
[Info] Load ImageNet pretrain from '/media/dell/Elements/DATA/core/models/resnet101.pth'
missing_keys: []
unexpected_keys: ['fc.weight', 'fc.bias']
[Info] Load ImageNet pretrain from '/media/dell/Elements/DATA/core/models/resnet101.pth'
missing_keys: []
unexpected_keys: ['fc.weight', 'fc.bias']
[2022-10-25 20:21:07,206][ base.py][line: 41][ INFO] # samples: 2150
[2022-10-25 20:21:07,208][ base.py][line: 41][ INFO] # samples: 2150
labeled: 4825
labeled: 4825
[2022-10-25 20:21:07,223][ base.py][line: 41][ INFO] # samples: 2825
[2022-10-25 20:21:07,226][ base.py][line: 41][ INFO] # samples: 2825
unlabeled: 2825
unlabeled: 2825
[2022-10-25 20:21:07,242][ base.py][line: 41][ INFO] # samples: 500
[2022-10-25 20:21:07,242][ base.py][line: 41][ INFO] # samples: 500
[2022-10-25 20:21:07,242][ builder.py][line: 28][ INFO] Get loader Done...
[2022-10-25 20:21:07,242][ builder.py][line: 28][ INFO] Get loader Done...
No checkpoint found in 'checkpoints/ckpt.pth'
[2022-10-25 20:21:07,255][ lr_helper.py][line: 65][ INFO] The kwargs for lr scheduler: 0.9
[2022-10-25 20:21:07,257][ lr_helper.py][line: 65][ INFO] The kwargs for lr scheduler: 0.9
epoch [ 0 : ] sample_rate_target_class_conf [0.10357965 0.0711445 0.04550922 0.09053792 0.05731867 0.11211205
0.07049027 0.05302454 0.08098789 0.0774475 0.04990353 0.0625474
0.04623231 0.07916455]
epoch [ 0 : ] criterion_per_class tensor([0.0901, 0.8649, 0.4227, 0.1821, 0.7204, 0.9933, 0.4074, 0.9022, 0.9825,
0.0326, 0.3988, 0.7301, 0.9449, 0.5666, 0.6234, 0.9697, 0.8400, 0.9904,
0.5960], device='cuda:0')
epoch [ 0 : ] sample_rate_per_class_conf tensor([0.9406, 0.1396, 0.5967, 0.8455, 0.2890, 0.0069, 0.6126, 0.1011, 0.0181,
1.0000, 0.6215, 0.2790, 0.0570, 0.4481, 0.3893, 0.0314, 0.1654, 0.0099,
0.4176], device='cuda:0')

About the synthia dataset

The download link on the SYNTHIA website is not working. Is there any other way to download it?

About Training Resources.

Hello Author, can STAL be trained on a single GPU, such as the 3090?

Which one should I use when I want to inference? Teacher model? or Student model?

Runtime Error and CudnnBatchNormBackward0 Issue During Code Execution

While running the code, I encountered the error message "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256]] is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later."
Upon further investigation, I found that the issue is related to "Error detected in CudnnBatchNormBackward0. No forward pass information available. Enable detect anomaly during forward pass for more information." As I am not familiar with distributed training, I would like to inquire if the community has any insights into this problem.

Here are the specific output logs:
log_20230810_161918.txt

I can't find the file "ILM-ASSL/datasets/rename.py"

About the data path

I noticed that when you use the DataLoader in your code, you use cfg["data_list"] directly. In fact, there are two data_lists in cfg; how do you make sure that you are loading the correct one? Also, there are misspelled variables in your published code. Have you ever run your published code?