
human-segmentation-pytorch's Introduction

Human-Segmentation-PyTorch

Human segmentation models, training/inference code, and trained weights, implemented in PyTorch.

Supported networks

To assess the architecture, memory, forward time (on either CPU or GPU), number of parameters, and number of FLOPs of a network, use this command:

python measure_model.py
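
For a quick sanity check, the parameter count alone can be reproduced with a few lines of PyTorch. The helper below is only a minimal sketch, not the measure_model.py implementation; it accepts any of the supported networks as an nn.Module.

import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # Sum the sizes of all trainable parameter tensors
    return sum(p.numel() for p in model.parameters() if p.requires_grad)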

Dataset

Portrait Segmentation (Human/Background)
Supervisely Person (Human/Background)

Setup

  • Python 3.6.x is used in this repository.
  • Clone the repository:
git clone --recursive https://github.com/AntiAegis/Human-Segmentation-PyTorch.git
cd Human-Segmentation-PyTorch
git submodule sync
git submodule update --init --recursive
  • To install the required packages, activate your virtual environment (here, via virtualenvwrapper's workon) and use pip:
workon humanseg
pip install -r requirements.txt
pip install -e models/pytorch-image-models

Training

  • To train a network from scratch, for example DeepLabV3+, use this command:
python train.py --config config/config_DeepLab.json --device 0

where config/config_DeepLab.json is the configuration file, which specifies the network, dataloader, optimizer, losses, metrics, and visualization settings.

  • To resume training from a checkpoint, use this command:
python train.py --config config/config_DeepLab.json --device 0 --resume path_to_checkpoint/model_best.pth
  • Training progress can be monitored in TensorBoard by enabling the visualization mode in the configuration file and then launching TensorBoard as shown below.
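
Once visualization is enabled, point TensorBoard at the log directory defined in the configuration file; the directory name below is a placeholder, not necessarily the repository's default:

tensorboard --logdir path_to_log_dir --port 6006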

Inference

There are two modes of inference: video and webcam.

python inference_video.py --watch --use_cuda --checkpoint path_to_checkpoint/model_best.pth
python inference_webcam.py --use_cuda --checkpoint path_to_checkpoint/model_best.pth

Benchmark

  • Networks are trained on a dataset combined from the two datasets mentioned above, with 6627 training and 737 testing images.
  • The model input size is set to 320x320.
  • The CPU and GPU times are the average inference time over 10 runs (preceded by 10 warm-up runs) with batch size 1; a timing sketch is given after the table below.
  • The mIoU is measured on the testing subset (737 images) of the combined dataset.
  • Hardware configuration for benchmarking:
CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
GPU: GeForce GTX 1050 Mobile, CUDA 9.0
Model                                        Parameters   FLOPs    CPU time   GPU time   mIoU
UNet_MobileNetV2 (alpha=1.0, expansion=6)    4.7M         1.3G     167ms      17ms       91.37%
UNet_ResNet18                                16.6M        9.1G     165ms      21ms       90.09%
DeepLab3+_ResNet18                           16.6M        9.1G     133ms      28ms       91.21%
BiSeNet_ResNet18                             11.9M        4.7G     88ms       10ms       87.02%
PSPNet_ResNet18                              12.6M        20.7G    235ms      666ms      ---
ICNet_ResNet18                               11.6M        2.0G     48ms       55ms       86.27%
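
The timing protocol above (10 warm-up runs, then the average of 10 timed runs at batch size 1 with a 320x320 input) can be approximated with the following sketch. It illustrates the methodology only and is not the repository's measurement script.

import time
import torch

@torch.no_grad()
def average_forward_time(model, device="cuda", input_size=320, warmup=10, runs=10):
    # Build a single dummy batch matching the benchmark setting (batch size 1, 320x320)
    model = model.to(device).eval()
    x = torch.randn(1, 3, input_size, input_size, device=device)
    for _ in range(warmup):              # warm-up runs are not measured
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()         # make sure queued GPU work has finished
    start = time.time()
    for _ in range(runs):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.time() - start) / runs * 1000.0   # milliseconds per forward pass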


human-segmentation-pytorch's Issues

Training from scratch fails

I am hitting this issue when training from scratch:

Traceback (most recent call last):
  File "train.py", line 101, in <module>
    main(config, args.resume)
  File "train.py", line 55, in main
    trainer.train()
  File "/path/Human-Segmentation-PyTorch/base/base_trainer.py", line 95, in train
    result = self._train_epoch(epoch)
  File "/path/Human-Segmentation-PyTorch/trainer/trainer.py", line 81, in _train_epoch
    loss = self.loss(output, target)
  File "/path/Human-Segmentation-PyTorch/evaluation/losses.py", line 18, in dice_loss
    targets = torch.zeros_like(logits).scatter_(dim=1, index=targets.type(torch.int64), src=torch.tensor(1.0))
RuntimeError: Index tensor must have the same number of dimensions as src tensor

I suspect the problem is in my dataset labeling. What is the correct format? My mask has the same size as the original image and a single channel (I have 2 classes), i.e., a 1-channel mask containing 0 and 1 (or 0 and 255).
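
For what it is worth, the scatter_ call in dice_loss requires the index to have the same number of dimensions as the logits, so a (N, H, W) mask needs an explicit channel dimension, and 255-valued masks should be remapped to 1 first. The snippet below only illustrates that shape requirement under those assumptions; it is not the repository's exact preprocessing.

import torch

def one_hot_targets(logits, mask):
    # logits: (N, C, H, W); mask: (N, H, W) or (N, 1, H, W) integer labels
    if mask.dim() == 3:
        mask = mask.unsqueeze(1)              # add the channel dimension -> (N, 1, H, W)
    mask = (mask > 0).long()                  # map {0, 255} (or {0, 1}) to {0, 1}
    # One-hot encode along the class dimension so the target matches the logits shape
    return torch.zeros_like(logits).scatter_(1, mask, 1.0)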

error test video

/Human-Segmentation-PyTorch$ python inference_video.py --watch --checkpoint ./checkpoint/UNet_ResNet18.pth
Traceback (most recent call last):
File "inference_video.py", line 80, in
model.load_state_dict(trained_dict, strict=False)
File "/home/anaconda3/envs/humansegmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for UNet:
size mismatch for decoder1.deconv.weight: copying a param with shape torch.Size([512, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([1280, 96, 4, 4]).
size mismatch for decoder1.deconv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for decoder2.deconv.weight: copying a param with shape torch.Size([256, 128, 4, 4]) from checkpoint, the shape in current model is torch.Size([96, 32, 4, 4]).
size mismatch for decoder2.deconv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for decoder3.deconv.weight: copying a param with shape torch.Size([128, 64, 4, 4]) from checkpoint, the shape in current model is torch.Size([32, 24, 4, 4]).
size mismatch for decoder3.deconv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([24]).
size mismatch for decoder4.deconv.weight: copying a param with shape torch.Size([64, 64, 4, 4]) from checkpoint, the shape in current model is torch.Size([24, 16, 4, 4]).
size mismatch for decoder4.deconv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for conv_last.0.weight: copying a param with shape torch.Size([3, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 16, 3, 3]).
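
The mismatched shapes (1280, 96, 32, 24, 16) look like MobileNetV2 decoder channels, which suggests the constructed model uses a different backbone than the ResNet18 checkpoint; making the architecture in the script or config match the checkpoint should resolve this. As a hedged fallback sketch (not an official fix), incompatible tensors can also be filtered out before loading:

import torch

# `model` is assumed to be the UNet instance already constructed by inference_video.py
trained_dict = torch.load("./checkpoint/UNet_ResNet18.pth", map_location="cpu")
model_dict = model.state_dict()
# Keep only parameters whose names and shapes match the constructed model
compatible = {k: v for k, v in trained_dict.items()
              if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(compatible)
model.load_state_dict(model_dict)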

About downloading Supervisely dataset

Hi, thanks for your wonderful work.
I want to use your code for training, but the download speed of the Supervisely dataset (from supervise.ly) is very slow.
How did you download this dataset, and do you have a Google Drive link for it?
Thanks!

error: python inference_video.py --watch --checkpoint ./checkpoint/UNet_ResNet18.pth

Dear author,
~/Human-Segmentation-PyTorch$ python inference_video.py --watch --checkpoint ./checkpoint/UNet_ResNet18.pth
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
Traceback (most recent call last):
File "inference_video.py", line 80, in
model.load_state_dict(trained_dict, strict=False)
File "/home/anaconda3/envs/humansegmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for UNet:
size mismatch for decoder1.deconv.weight: copying a param with shape torch.Size([512, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([1280, 96, 4, 4]).
size mismatch for decoder1.deconv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for decoder2.deconv.weight: copying a param with shape torch.Size([256, 128, 4, 4]) from checkpoint, the shape in current model is torch.Size([96, 32, 4, 4]).
size mismatch for decoder2.deconv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for decoder3.deconv.weight: copying a param with shape torch.Size([128, 64, 4, 4]) from checkpoint, the shape in current model is torch.Size([32, 24, 4, 4]).
size mismatch for decoder3.deconv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([24]).
size mismatch for decoder4.deconv.weight: copying a param with shape torch.Size([64, 64, 4, 4]) from checkpoint, the shape in current model is torch.Size([24, 16, 4, 4]).
size mismatch for decoder4.deconv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for conv_last.0.weight: copying a param with shape torch.Size([3, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 16, 3, 3]).

Can you tell me what this error means?

Model Size

Hi, thank you for your great work. I downloaded your model from Google Drive and found that the file size is much larger than the model size listed in the README. Is this normal?

pretrained models

Hello, thanks for your work. Could you provide a Google Drive URL for your pretrained models? Thanks.

error after 1 epoch

When I run train.py with the UNet config, the following error appears after training for one epoch:

Traceback (most recent call last):
  File "train.py", line 101
    trainer.train()
  File "base/base_trainer.py", line 95, in train
    result = self._train_epoch(epoch)
  File "trainer/trainer.py", line 98, in _train_epoch
    self.writer_train.add_image('train/output', make_grid(...))
  File "torchvision/utils.py", line 66, in make_grid
    norm_range(tensor, range)
  File "torchvision/utils.py", line 60, in norm_range
    norm_ip(t, t.min, max=max)
  File "torchvision/utils.py", line 53, in norm_ip
    img.clamp_(...)
TypeError: argument 'min' must be Number, not Tensor

Can you help me? My own dataset's input is RGB and the mask is 1 channel with values in {0, 1}.
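
One hedged workaround (not the repository's official fix) is to normalize the prediction to a float tensor in [0, 1] yourself before handing it to make_grid/add_image, so that norm_range never receives tensor-valued min/max arguments:

import torch
from torchvision.utils import make_grid

def prediction_grid(output):
    # output: (N, C, H, W) logits; reduce to a (N, 1, H, W) label map
    preds = output.argmax(dim=1, keepdim=True).float()
    preds = preds / max(preds.max().item(), 1.0)   # scale labels into [0, 1]
    return make_grid(preds, nrow=4)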

voc test

Hi, have you tried this project on the VOC2012 dataset? Something goes wrong with that dataset for me.

About the "ResNet" backbones weights

Thanks for your work!
I have two questions:

  • Where can I get the pretrained weights for the "ResNet" backbones?
  • In the following code, where is the definition of the function _load_pretrained_model?

    def resnet50(pretrained=None, **kwargs):
        model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
        if pretrained is not None:
            model._load_pretrained_model(pretrained)
        return model
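
This is not the repository's actual definition, but a backbone loader like _load_pretrained_model typically looks something like the sketch below: load an ImageNet checkpoint and copy only the keys that exist in the backbone.

import torch

def _load_pretrained_model(self, pretrained_path):
    pretrain_dict = torch.load(pretrained_path, map_location="cpu")
    model_dict = self.state_dict()
    # Drop keys that do not exist in the backbone (e.g., the ImageNet classifier head)
    pretrain_dict = {k: v for k, v in pretrain_dict.items() if k in model_dict}
    model_dict.update(pretrain_dict)
    self.load_state_dict(model_dict)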

About the pretrained model

Hi, I just wonder where I can get the pretrained models whose paths in the JSON files look like:
"pretrained_backbone": "/data4/livesegmentation/thuync/PyTorch-pretrained/resnet18.pth"
or
"pretrained_backbone": "/root/pretrain/resnet18.pth"
or
"/workspace/pretrain/resnet18.pth"?

pytorch-image-models @ 2be1fd0 not accessible

After running the command
$ git submodule update --init --recursive

it got stuck without a response. I also found that the submodule reference "pytorch-image-models @ 2be1fd0" points to a webpage that returns a 404 (not found) error.

About IOU of different models

Thanks for your great work!
The table at the bottom of the README shows results for different models, and I've noticed that UNet_MobileNetV2 gets the best mIoU rather than the models that have many more parameters. What do you think causes this? Thanks!

LICENSE?

Hello. Thank you very much for sharing the code and model.

What is the license for this project and the pretrained models? Can I use them in commercial projects?

I really need your reply.

cannot import name 'load_pretrained'

from timm.models.resnet import default_cfgs, load_pretrained, BasicBlock, Bottleneck
ImportError: cannot import name 'load_pretrained'

In timm.models.resnet there is no 'load_pretrained'.
Can someone tell me how to fix this?
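
One possibility is that a newer standalone timm is installed instead of the pinned pytorch-image-models submodule, in which case load_pretrained no longer lives in timm.models.resnet. Re-installing the submodule (pip install -e models/pytorch-image-models) is the safer fix; as a hedged sketch only, some timm versions expose the helper under timm.models.helpers instead, which depends entirely on the installed version:

try:
    from timm.models.resnet import load_pretrained
except ImportError:
    # Assumption: the installed timm version keeps the helper here instead
    from timm.models.helpers import load_pretrained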

mIoU

Could you give me some instructions on how to calculate the mIoU with your code?
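
As a rough sketch of the standard computation (not the repository's exact metric code): accumulate a confusion matrix over the whole test set, compute the per-class IoU, and average.

import numpy as np

def mean_iou(preds, masks, num_classes=2):
    # preds, masks: iterables of (H, W) integer label maps with values in [0, num_classes)
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, m in zip(preds, masks):
        idx = num_classes * m.reshape(-1).astype(np.int64) + p.reshape(-1)
        conf += np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    return float(np.mean(intersection / np.maximum(union, 1)))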

can't load the trained model

When I load the trained model UNet_ResNet18.pth with torch.load(DeepLabV3Plus_ResNet18.pth), an error occurs; the details were attached as a screenshot.
I would appreciate it if you could provide some advice. Thanks a lot.
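
A UNet/ResNet18 checkpoint cannot be restored into a DeepLabV3+ model, so the first thing to check is that the constructed architecture matches the file being loaded. The usual loading pattern is sketched below; the model class, constructor arguments, and import path are placeholders, not the repository's exact API.

import torch
from models import UNet   # hypothetical import path and class

model = UNet(backbone="resnet18", num_classes=2)          # assumed constructor arguments
state = torch.load("UNet_ResNet18.pth", map_location="cpu")
model.load_state_dict(state.get("state_dict", state))     # unwrap if the checkpoint is wrapped
model.eval()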
