
wss's Introduction

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

PseudoSeg is a simple consistency training framework for semi-supervised image semantic segmentation. It re-designs pseudo-labeling in a simple, novel way to generate well-calibrated structured pseudo labels for training with unlabeled or weakly-labeled data. It was implemented by Yuliang Zou (research intern) in summer 2020.

This is not an official Google product.

Instructions

Installation

  • Use a virtual environment
virtualenv -p python3 --system-site-packages env
source env/bin/activate
  • Install packages
pip install -r requirements.txt

Dataset

Create a dataset folder under the ROOT directory, then download the pre-created tfrecords for voc12 and coco and extract them into the dataset folder. The filenames for each split are listed under the data_splits folder.

Training

NOTE:

  • We train all our models using 16 V100 GPUs.
  • The ImageNet pre-trained models can be downloaded here.
  • For VOC12, ${SPLIT} can be 2_clean, 4_clean, 8_clean, or 16_clean_3 (representing the 1/2, 1/4, 1/8, and 1/16 splits); ${NUM_ITERATIONS} should be set to 30000.
  • For COCO, ${SPLIT} can be 32_all, 64_all, 128_all, 256_all, or 512_all (representing the 1/32, 1/64, 1/128, 1/256, and 1/512 splits); ${NUM_ITERATIONS} should be set to 200000.
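
The split names and iteration counts in the notes above can be collected in a small lookup, sketched here in Python. The names TRAINING_SETUPS and training_setup are hypothetical and not part of the repository; only the split names and step counts come from the notes.

```python
# Hypothetical helper: only the split names and step counts come from the notes above.
TRAINING_SETUPS = {
    "voc12": {
        # split name -> denominator of the labeled fraction (e.g. 8 means 1/8)
        "splits": {"2_clean": 2, "4_clean": 4, "8_clean": 8, "16_clean_3": 16},
        "num_iterations": 30000,
    },
    "coco": {
        "splits": {"32_all": 32, "64_all": 64, "128_all": 128,
                   "256_all": 256, "512_all": 512},
        "num_iterations": 200000,
    },
}


def training_setup(dataset, split):
    """Return (fraction denominator, NUM_ITERATIONS) for a dataset/split pair."""
    try:
        setup = TRAINING_SETUPS[dataset]
        denominator = setup["splits"][split]
    except KeyError:
        raise ValueError(f"unknown dataset/split: {dataset!r}/{split!r}") from None
    return denominator, setup["num_iterations"]


if __name__ == "__main__":
    denominator, steps = training_setup("voc12", "8_clean")
    print(f"SPLIT=8_clean uses 1/{denominator} of the labels, NUM_ITERATIONS={steps}")
```

Rejecting unknown combinations early catches mistakes such as passing a COCO split name together with the VOC12 iteration count.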

Supervised baseline

python train_sup.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}"

PseudoSeg (w/ unlabeled data)

python train_wss.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --train_split_cls="train_aug" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}"

PseudoSeg (w/ image-level labeled data)

python train_wss.py \
  --logtostderr \
  --train_split="${SPLIT}" \
  --train_split_cls="train_aug" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=16 \
  --train_batch_size=64 \
  --training_number_of_steps="${NUM_ITERATIONS}" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="${INIT_FOLDER}/xception_65/model.ckpt" \
  --train_logdir="${TRAIN_LOGDIR}" \
  --dataset_dir="${DATASET}" \
  --weakly=true

Evaluation

NOTE: ${EVAL_CROP_SIZE} should be 513,513 for VOC12, 641,641 for COCO.

python eval.py \
  --logtostderr \
  --eval_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --eval_crop_size="${EVAL_CROP_SIZE}" \
  --checkpoint_dir="${TRAIN_LOGDIR}" \
  --eval_logdir="${EVAL_LOGDIR}" \
  --dataset_dir="${DATASET}" \
  --max_number_of_evaluations=1
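
The crop-size rule from the note above can be encoded so that the right value is chosen per dataset. This is a minimal sketch: the helper build_eval_command and the dictionary EVAL_CROP_SIZES are hypothetical names, the paths in the usage line are placeholders, and the flags are copied from the command above.

```python
import shlex

# Crop sizes from the note above; the helper itself is hypothetical.
EVAL_CROP_SIZES = {"voc12": "513,513", "coco": "641,641"}


def build_eval_command(dataset, checkpoint_dir, eval_logdir, dataset_dir):
    """Return the eval.py command as an argument list for the given dataset."""
    if dataset not in EVAL_CROP_SIZES:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        "python", "eval.py",
        "--logtostderr",
        "--eval_split=val",
        "--model_variant=xception_65",
        "--atrous_rates=6",
        "--atrous_rates=12",
        "--atrous_rates=18",
        "--output_stride=16",
        "--decoder_output_stride=4",
        f"--eval_crop_size={EVAL_CROP_SIZES[dataset]}",
        f"--checkpoint_dir={checkpoint_dir}",
        f"--eval_logdir={eval_logdir}",
        f"--dataset_dir={dataset_dir}",
        "--max_number_of_evaluations=1",
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own directories.
    cmd = build_eval_command("coco", "train_log", "eval_log", "dataset")
    print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues around the comma-separated crop size.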

Visualization

NOTE: ${VIS_CROP_SIZE} should be 513,513 for VOC12, 641,641 for COCO.

python vis.py \
  --logtostderr \
  --vis_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --vis_crop_size="${VIS_CROP_SIZE}" \
  --checkpoint_dir="${CKPT}" \
  --vis_logdir="${VIS_LOGDIR}" \
  --dataset_dir="${PASCAL_DATASET}" \
  --also_save_raw_predictions=true

Citation

If you use this work for your research, please cite our paper.

@article{zou2020pseudoseg,
  title={PseudoSeg: Designing Pseudo Labels for Semantic Segmentation},
  author={Zou, Yuliang and Zhang, Zizhao and Zhang, Han and Li, Chun-Liang and Bian, Xiao and Huang, Jia-Bin and Pfister, Tomas},
  journal={International Conference on Learning Representations (ICLR)},
  year={2021}
}


wss's Issues

Concrete results on COCO

Hi,

Would you mind sharing the concrete numbers for PseudoSeg and the supervised baseline in Figure 3b (COCO dataset)? It would help me a lot, thanks.

Question about Tab.5 which compares different pseudo labeling strategies

Thanks for your impressive work!

I notice that Tab.5 compares different pseudo labeling strategies. Are the numbers listed there mIoU on the validation set, or mIoU of the pseudo masks on unlabeled images from the training set?

It seems to be the latter. If so, are these unlabeled images from the original VOC training set (1,464 images) or from the augmented SBD images?

Security Policy violation Binary Artifacts

This issue was automatically created by Allstar.

Security Policy Violation
Project is out of compliance with Binary Artifacts policy: binaries present in source code

Rule Description
Binary Artifacts are an increased security risk in your repository. Binary artifacts cannot be reviewed, allowing the introduction of possibly obsolete or maliciously subverted executables. For more information see the Security Scorecards Documentation for Binary Artifacts.

Remediation Steps
To remediate, remove the generated executable artifacts from the repository.

First 10 Artifacts Found

  • third_party/deeplab/__pycache__/__init__.cpython-36.pyc
  • third_party/deeplab/__pycache__/common.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/__init__.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/conv2d_ws.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/dense_prediction_cell.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/feature_extractor.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/nas_cell.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/nas_genotypes.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/nas_network.cpython-36.pyc
  • third_party/deeplab/core/__pycache__/preprocess_utils.cpython-36.pyc
  • Run a Scorecards scan to see full list.

Additional Information
This policy is drawn from Security Scorecards, which is a tool that scores a project's adherence to security best practices. You may wish to run a Scorecards scan directly on this repository for more details.


Allstar has been installed on all Google managed GitHub orgs. Policies are gradually being rolled out and enforced by the GOSST and OSPO teams. Learn more at http://go/allstar

This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

About the device

Hi, thx for your work!
I am trying to reproduce your work on 2 GPUs,
so I ran the following command:

python train_wss.py \
  --logtostderr \
  --train_split="8_clean" \
  --train_split_cls="train_aug" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size="513,513" \
  --num_clones=2 \
  --train_batch_size=8 \
  --training_number_of_steps="30000" \
  --fine_tune_batch_norm=true \
  --tf_initial_checkpoint="/wss-master/init/deeplabv3_xception_2018_01_04/model.ckpt" \
  --train_logdir="/wss-master/train_log" \
  --dataset_dir="/wss-master/data/pascal_voc_seg"

But it seems that the GPUs are not utilized; the code just runs on the CPU.
I'm not quite familiar with TensorFlow, could you plz give me some hints to solve this problem?
Thx!
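
A first diagnostic for a report like this is to check whether TensorFlow can see any GPU at all. The sketch below is not part of the repository: the function name visible_gpus is hypothetical, and it assumes a TensorFlow installation that exposes tensorflow.python.client.device_lib. An empty result often points to a CPU-only TensorFlow build or a CUDA/cuDNN mismatch, though other causes are possible.

```python
import os


def visible_gpus():
    """Return the GPU device names TensorFlow can see, or None if TensorFlow is missing."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    # Each entry describes one local device; keep only those of type "GPU".
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


if __name__ == "__main__":
    # An empty or wrong CUDA_VISIBLE_DEVICES hides GPUs from TensorFlow.
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES", "<unset>"))
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("TensorFlow sees no GPU; training would fall back to the CPU.")
    else:
        print("TensorFlow sees:", ", ".join(gpus))
```

If the list is shorter than the value passed to --num_clones, the remaining clones have no GPU to run on.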

About the results shown in Table 1

Hi, sorry to bother you and thx for your sharing.
Could you plz tell me how to set --train_split="${SPLIT}" when I want to reproduce your results shown in Table 1 (the semi-supervised setting with 1.4k labeled images)?
Should it be --train_split="8_clean"? Or is this split for the low-data setting?
Thanks for your help!

More data splits information

Hi, thanks for sharing! I have two questions about the data splits:

  1. Could you please supply 2_clean.txt for PASCAL VOC?
  2. Could you tell me which 1/16 split of VOC you used in Table 2 of the main paper? The repo supplies three different 1/16 splits, which confuses me.

Thanks again!

Segmentation fault

Hi,
Thank you so much for your work! I'd like to try it on a different dataset and I was wondering if you could guide me through the most important things that I have to prepare to be able to run your code?
I started with the most basic thing. I created a dataset directory, downloaded the pre-created tfrecords for voc12, and put them in dataset.
I wanted to try the training on one GPU, so I ran python3 train_sup.py --num_clones 1 --train_logdir logs/ --dataset_dir dataset/ but I am getting a segmentation fault. What do you think I am doing wrong?

Thank you so much in advance!

tfrecord file?

If I want to use your method on other datasets, how can I generate the tfrecord files? Can you share the code for generating them?

Concern on the details of the comparison results in Table-2

Really nice paper!

We carefully read your work and find the experimental settings on Pascal-VOC in Table-2 (as shown below) really interesting: in the last column of Table-2, all the methods use only 92 images as the labeled set and choose the train-aug set (10582) as the unlabeled set, according to the code:

wss/core/data_generator.py

Lines 85 to 104 in 8069dbe

_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1464,
        'train_aug': 10582,
        'trainval': 2913,
        'val': 1449,
        # Splits for semi-supervised
        '4_clean': 366,
        '8_clean': 183,
        # Balanced 1/16 split
        # Sample with rejection, suffix represents the sample index
        # e.g., 16_clean_3 represents the 3rd random shuffle to sample 1/16
        # split, given a fixed random seed 8888
        '16_clean_3': 92,
        '16_clean_14': 92,
        '16_clean_20': 92,
        # More images
        'coco2voc_labeled': 91937,
        'coco2voc_unlabeled': 215340,
    },

and,

split_name=FLAGS.train_split_cls,

Our understanding is that FLAGS.train_split_cls represents the set of unlabeled images used for training and that its value is train_aug by default. So the number of unlabeled images is more than 100x the number of labeled images. Given that the total number of training iterations is set as training_number_of_steps=30000, we will iterate over the 92 sampled labeled images for about 30000x64/92, that is roughly 20870, epochs. Is our understanding correct?
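
The estimate above can be checked with a few lines of arithmetic. The sketch assumes, as the question does, that every training step draws a full batch of 64 images from the split in question; the function name epochs is hypothetical, and the sizes are taken from the splits_to_sizes excerpt and the training command.

```python
LABELED_IMAGES = 92       # size of a 16_clean_* split
UNLABELED_IMAGES = 10582  # size of train_aug
STEPS = 30000             # training_number_of_steps for VOC12
BATCH_SIZE = 64           # train_batch_size


def epochs(num_images, steps=STEPS, batch_size=BATCH_SIZE):
    """Passes over a split, assuming each step consumes batch_size of its images."""
    return steps * batch_size / num_images


if __name__ == "__main__":
    print(f"unlabeled/labeled ratio: {UNLABELED_IMAGES / LABELED_IMAGES:.2f}")
    print(f"epochs over labeled set: {epochs(LABELED_IMAGES):.2f}")
    print(f"epochs over train_aug:   {epochs(UNLABELED_IMAGES):.2f}")
```

Under this assumption the labeled split is revisited about 20870 times while train_aug is covered about 181 times, which is the imbalance the question is concerned with.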

If our understanding is correct, we are curious whether training for so many epochs on the 92 labeled images is a good choice. Besides, since the train-aug set (10582) contains the 92 labeled images, we guess that all the methods also apply the pseudo-label or consistency objectives to the labeled images (instead of only to the unlabeled images).

Many thanks, and we look forward to your explanation if our understanding is wrong!

image

Class problem with the additional-data training

Thanks for the great work.
In section 4.5 of the published paper, different datasets are combined to train a model in a supervised way. However, the class labels of these datasets (COCO, VOC, and Cityscapes) differ, so how is this handled during training?
Looking forward to your reply.

Influence of the color jittering parameters

Great work! We find that you apply only the color jittering augmentation as the strong augmentation. So we are very interested in the influence of the choice of the color jittering parameters.

For example, the default setting in the released code is,

brightness = 0.5
contrast = 0.5
saturation = 0.5
hue = 0.25

According to the SimCLR paper, they set them as follows:

brightness = 0.8
contrast = 0.8
saturation = 0.8
hue = 0.2

It would be great if you could share more results on the influence of the choice of these four hyperparameters!
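
To make the two settings easier to compare, the sketch below converts each into sampling intervals. It assumes a SimCLR-style convention in which brightness, contrast, and saturation are multiplicative factors drawn from [max(0, 1 - s), 1 + s] and hue is an additive shift in [-h, h]. Whether the released code follows exactly this convention is not confirmed here, and the function name jitter_ranges is hypothetical.

```python
def jitter_ranges(brightness, contrast, saturation, hue):
    """Sampling intervals under an assumed SimCLR-style convention."""
    def factor(strength):
        # Multiplicative factor interval, clipped so it never goes below zero.
        return (max(0.0, 1.0 - strength), 1.0 + strength)

    return {
        "brightness": factor(brightness),
        "contrast": factor(contrast),
        "saturation": factor(saturation),
        "hue": (-hue, hue),  # additive shift of the hue channel
    }


# Values quoted in the question above.
RELEASED = jitter_ranges(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.25)
SIMCLR = jitter_ranges(brightness=0.8, contrast=0.8, saturation=0.8, hue=0.2)

if __name__ == "__main__":
    for name in RELEASED:
        lo, hi = RELEASED[name]
        s_lo, s_hi = SIMCLR[name]
        print(f"{name:<10} released=[{lo:.2f}, {hi:.2f}]  simclr=[{s_lo:.2f}, {s_hi:.2f}]")
```

Under this reading the released setting perturbs brightness, contrast, and saturation over a narrower interval than SimCLR but allows a slightly larger hue shift.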
