liminn / icnet-pytorch

ICNet implemented in PyTorch for real-time semantic segmentation on high-resolution images. mIoU is 71.0% on Cityscapes, single-image inference time is 19 ms, and FPS is 52.6.

License: MIT License

Python 100.00%

icnet cityscapes real-time semantic-segmentation pytorch

icnet-pytorch's Introduction

Description

This repo contains ICNet implemented in PyTorch, based on the paper by Hengshuang Zhao et al. (ECCV'18). Training and evaluation are done on the Cityscapes dataset by default.

Requirements

Python 3.6 or later, with the following packages (install them with pip3 install -r requirements.txt):

torch==1.1.0
torchsummary==1.5.1
torchvision==0.3.0
numpy==1.17.0
Pillow==6.0.0
PyYAML==5.1.2

Updates

2019.11.15: changed crop_size to 960; the best mIoU increased to 71.0%. Training took about 2 days. The resulting checkpoint is icnet_resnet50_197_0.710_best_model.pth.

Performance

Method	mIoU(%)	Time(ms)	FPS	Memory(GB)	GPU
ICNet(paper)	67.7	33	30.3	1.6	TitanX
ICNet(ours)	71.0	19	52.6	1.86	GTX 1080Ti

Based on the Cityscapes dataset: trained only on the training set and tested on the validation set, using a single GTX 1080Ti card. The input size in the test phase is 2048x1024x3.
For the performance of the original paper, see Table 2 in the paper.
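
The FPS column follows directly from the per-image time. The short sketch below only illustrates that arithmetic; the function name fps_from_latency_ms is made up for this example and is not part of the repo.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-image inference time in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Times taken from the performance table above.
for method, latency_ms in [("ICNet(paper)", 33.0), ("ICNet(ours)", 19.0)]:
    print(f"{method}: {fps_from_latency_ms(latency_ms):.1f} FPS")
```

Note that this treats throughput as the inverse of single-image latency, which is how the table's numbers relate to each other; batched inference would behave differently.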

Demo

image	predict

All the input images come from the validation set of Cityscapes. See the demo/ directory for more demo results.

Usage

Training

First, modify the configuration in the configs/icnet.yaml file:

### 3.Training
train:
  specific_gpu_num: "1"   # for example: "0", "1" or "0, 1"
  train_batch_size: 7    # adjust according to gpu resources
  cityscapes_root: "/home/datalab/ex_disk1/open_dataset/Cityscapes/"
  ckpt_dir: "./ckpt/"     # ckpt and training log will be saved here

Then, run: python3 train.py
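
The specific_gpu_num value is a string of GPU indices. A common way to honor such a setting is to export it as CUDA_VISIBLE_DEVICES before CUDA is initialized. The sketch below assumes that is how the value is consumed, which the README does not confirm; the helper name visible_devices is hypothetical.

```python
import os


def visible_devices(specific_gpu_num: str) -> str:
    """Normalize a value like "0, 1" into "0,1" (hypothetical helper, not from the repo)."""
    ids = [part.strip() for part in str(specific_gpu_num).split(",") if part.strip()]
    if not ids or not all(gpu_id.isdigit() for gpu_id in ids):
        raise ValueError(f"invalid GPU list: {specific_gpu_num!r}")
    return ",".join(ids)


# Assumption: this must run before the first CUDA call in the process to take effect.
os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices("0, 1")
print(os.environ["CUDA_VISIBLE_DEVICES"])
```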

Evaluation

First, modify the configuration in the configs/icnet.yaml file:

### 4.Test
test:
  ckpt_path: "./ckpt/icnet_resnet50_197_0.710_best_model.pth"  # set the pretrained model path correctly

Then, run: python3 evaluate.py

Discussion

The structure of ICNet is mainly composed of sub4, sub2, sub1 and head:

sub4: basically a PSPNet; the biggest difference is a modified pyramid pooling module.
sub2: the first three stages of convolutional layers of sub4; sub2 and sub4 share these three stages.
sub1: three consecutive strided convolutional layers that quickly downsample the original large input image.
head: the CFF module connects the outputs of the three cascaded branches (sub4, sub2 and sub1). Finally, a 1x1 convolution and interpolation produce the output.
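
To make the cascade concrete, the sketch below computes the feature-map size each branch hands to the head for a given input. The output strides (1/8 for sub1, 1/16 for sub2, 1/32 for sub4) are an assumption taken from the ICNet paper, not values read from this repo's code, and exact sizes in a real network can differ by a pixel depending on padding.

```python
from typing import Dict, Tuple

# Assumed output stride of each branch relative to the original input (per the ICNet paper).
BRANCH_OUTPUT_STRIDE: Dict[str, int] = {"sub1": 8, "sub2": 16, "sub4": 32}


def branch_feature_sizes(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Return the approximate (width, height) of each branch's output feature map."""
    return {
        name: (width // stride, height // stride)
        for name, stride in BRANCH_OUTPUT_STRIDE.items()
    }


# Cityscapes test-time input size used in this README.
for name, size in branch_feature_sizes(2048, 1024).items():
    print(name, size)
```

Under these assumptions, each CFF step fuses a coarser map with one that is twice its resolution, which is why the branches are listed from sub4 up to sub1.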

During training, I found that the pyramid pooling module in sub4 is very important. It can significantly improve the performance of the network and of lightweight models.

The most important thing in the data preprocessing phase is to set crop_size reasonably: it should be as close as possible to the input size of the prediction phase. Here are my experiments:

I set base_size to 520, which means the shorter side of the image is resized to a value between 520x0.5 and 520x2, and set crop_size to 480, which means a random 480x480 patch is cropped for training. The final best mIoU is 66.7%.
I set base_size to 1024, which means the shorter side of the image is resized to a value between 1024x0.5 and 1024x2, and set crop_size to 720, which means a random 720x720 patch is cropped for training. The final best mIoU is 69.9%.
Because our target dataset is Cityscapes, whose image size is 2048x1024, the larger crop_size (720x720) is better. I have not tried a larger crop_size (such as 960x960 or 1024x1024) yet, because it would result in a very small batch size and be very time-consuming; in addition, the current mIoU is already high. But I believe that a larger crop_size will bring a higher mIoU.
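
The base_size and crop_size behaviour described above can be sketched as follows. This is a minimal illustration of the geometry only (no image library involved), assuming uniform sampling of the shorter side and padding when the resized image is smaller than the crop; the function name sample_scale_and_crop and those details are assumptions, not the repo's actual data loader.

```python
import random
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def sample_scale_and_crop(
    width: int, height: int, base_size: int, crop_size: int, rng: random.Random
) -> Tuple[Tuple[int, int], Box]:
    """Pick a random resized size and a random crop box inside it."""
    # Shorter side is drawn between base_size * 0.5 and base_size * 2.
    short_side = rng.randint(int(base_size * 0.5), int(base_size * 2.0))
    if width >= height:
        new_h = short_side
        new_w = int(round(width * short_side / height))
    else:
        new_w = short_side
        new_h = int(round(height * short_side / width))
    # Assumption: images smaller than the crop are padded up to crop_size.
    padded_w = max(new_w, crop_size)
    padded_h = max(new_h, crop_size)
    left = rng.randint(0, padded_w - crop_size)
    top = rng.randint(0, padded_h - crop_size)
    return (new_w, new_h), (left, top, left + crop_size, top + crop_size)


rng = random.Random(0)
resized, box = sample_scale_and_crop(2048, 1024, base_size=1024, crop_size=720, rng=rng)
print("resized to", resized, "crop box", box)
```

With base_size=1024 the shorter side ranges from 512 to 2048, so a 720x720 crop covers a much larger share of a typical resized image than a 480x480 crop does.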

In addition, I found that a small training technique can improve the performance of the model:

Set the learning rate of sub4 to the original initial learning rate (0.01), because it has pretrained backbone weights.
Set the learning rate of sub1 and head to 10 times the initial learning rate (0.1), because there are no pretrained weights for them.

This small training technique is really effective: it can improve the mIoU by 1 to 2 percentage points.
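
One way to express this two-rate setup is with per-parameter groups. The sketch below is plain Python and only builds the group list. It assumes backbone parameters can be recognized by a name prefix of "pretrained.", and the other parameter names in the usage example are invented placeholders.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    base_lr: float = 0.01,
    new_layer_multiplier: float = 10.0,
    backbone_prefix: str = "pretrained.",  # assumed naming convention
) -> List[Dict[str, Any]]:
    """Split parameters into a pretrained group (base_lr) and a new-layer group (10x)."""
    backbone, new_layers = [], []
    for name, param in named_params:
        if name.startswith(backbone_prefix):
            backbone.append(param)
        else:
            new_layers.append(param)
    return [
        {"params": backbone, "lr": base_lr},
        {"params": new_layers, "lr": base_lr * new_layer_multiplier},
    ]


# Placeholder names and values; in practice these would come from model.named_parameters().
fake_params = [("pretrained.conv1.weight", "w0"), ("conv_sub1.0.weight", "w1"), ("head.cls.weight", "w2")]
for group in build_param_groups(fake_params):
    print(len(group["params"]), "params at lr", group["lr"])
```

A list in this shape is what torch.optim.SGD accepts as its first argument when different learning rates are wanted per group.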

Any other questions, or reports of my mistakes, can be left in the comments section. I will reply as soon as possible.


icnet-pytorch's Issues

Train on my own dataset

How do I train on my own dataset? What parameters do I need to modify?

Retrained model

"icnet_resnet50_197_0.710_best_model.pth" does not exist?

download resnet50

When I run 'python3 train.py' as described in the README, I get this error:

Downloading /home/abc/.torch/models/resnet50-25c4b509.zip from https://hangzh.s3.amazonaws.com/encoding/models/resnet50-25c4b509.zip...
Traceback (most recent call last):
  File "train.py", line 229, in <module>
    trainer = Trainer(cfg)
  File "train.py", line 45, in __init__
    self.model = ICNet(nclass = train_dataset.NUM_CLASS, backbone='resnet50').to(self.device)
  File "/home/abc/guoanXu/ICNet-master/models/icnet.py", line 16, in __init__
    super(ICNet, self).__init__(nclass,backbone, pretrained_base=pretrained_base)
  File "/home/abc/guoanXu/ICNet-master/models/segbase.py", line 22, in __init__
    self.pretrained = resnet50_v1s(pretrained=pretrained_base, dilated=dilated, **kwargs)
  File "/home/abc/guoanXu/ICNet-master/models/base_models/resnetv1b.py", line 239, in resnet50_v1s
    model.load_state_dict(torch.load(get_resnet_file('resnet50', root=root)), strict=False)
  File "/home/abc/guoanXu/ICNet-master/models/model_store.py", line 51, in get_resnet_file
    overwrite=True)
  File "/home/abc/guoanXu/ICNet-master/utils/download.py", line 68, in download
    raise RuntimeError("Failed downloading url %s"%url)
RuntimeError: Failed downloading url https://hangzh.s3.amazonaws.com/encoding/models/resnet50-25c4b509.zip

How can I solve this, please?

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

When I run train.py, this problem suddenly appears. What should I do?

Question about reproducing 71 mIoU

Thanks for the amazing code. While reproducing your mIoU I ran into some problems:
1. To achieve 71 mIoU, did you train from an ImageNet-pretrained model?
2. I cannot load the pretrained model from "models/model_store.py" because of a broken link, so I use the official PyTorch model instead. But some parameters cannot be loaded because of the "deep_stem" flag in the ResNet model, so I set deep_stem=False. Will this hurt performance?

I use PyTorch 1.4 with CUDA 10. I currently achieve 66.1 mIoU with a ResNet50 backbone.

Run prediction for image in /demo directory

Hello, I'm a newbie, I am very interested in ICNet, and I saw your GitHub repo for it.
I read the README and want to ask how to run the code on a single test image, because I don't see instructions for running it on an image (which command should I run?).
Thanks a lot.

ResNet101

Sir, have you ever tried ResNet101-2a57e44d.pth (or another .pth based on ResNet101)? It seems it is not suitable for this model.
Also, the URL "https://hangzh.s3.amazonaws.com/" is invalid; we can't download the file from it.

Can this be used for video segmentation? I didn't see any code for it.

Hi, could you upload the 0.71 pretrained model again?

Could you upload the 0.71 pretrained model again? The link is invalid. Thank you!

What should I do if I want to test it on my own dataset?

After training on Cityscapes data, what should I change if I want to test on my own images?

Question about sub2 and sub4 sharing three convolution stages

Thank you for your codes.
I ran your code and got good results, but when I checked the code, I was confused about the following:
# sub 2
x_sub2 = F.interpolate(x, scale_factor=0.5, mode='bilinear', align_corners=True)
_, x_sub2, _, _ = self.base_forward(x_sub2)
# sub 4
x_sub4 = F.interpolate(x, scale_factor=0.25, mode='bilinear', align_corners=True)
_, _, _, x_sub4 = self.base_forward(x_sub4)
The paper and your discussion explain that sub2 and sub4 share weights and computation, but in your code you just feed the 1/2 and 1/4 images to the base model. Am I right?
How do you share the weights and convolutions, or how do you understand the definition of sharing?

the download link of pretrained model seems to be invalid

Hi,
the download link of the pretrained model seems to be invalid.
Could you provide a new link so we can use the pretrained model?
Thanks a lot!

Could you please provide a link to the pretrained model weights?

Could you please provide a link to the pretrained model weights? I want to use them for more experiments. Hoping for your reply.

.pth

trained model

Hello, the link to your model trained on Cityscapes seems to be dead.
Could you re-upload it when convenient? Many thanks.

How can I get the pretrained model?

How can I get the pretrained model?

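Ignore label question (label handling)

For the Cityscapes dataset, I changed the ignored label from the original -1 to 255 in the original script and changed the loss function's ignore label accordingly, but I got errors.
Why are the ignored labels all set to -1 when processing the Cityscapes dataset, and why does the bilinear interpolation used for resizing in the loss computation not raise an error? When I change the ignore label to 255, the loss computation raises an error; switching the interpolation mode to nearest-neighbor fixes it.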

Number of parameters

The number of parameters in your implemented model is 28298376, which is much more than in the official implementation.