
pytorch-segmentation-detection's Introduction

Image Segmentation and Object Detection in Pytorch

Pytorch-Segmentation-Detection is a library for image segmentation and object detection in PyTorch. It provides results reported on common segmentation/detection benchmarks, pretrained models, and scripts to reproduce them.

Segmentation

PASCAL VOC 2012

Implemented models were tested on the Restricted PASCAL VOC 2012 validation dataset (RV-VOC12) or the full PASCAL VOC 2012 validation dataset (VOC12), and trained on the PASCAL VOC 2012 training data plus the additional Berkeley segmentation data for PASCAL VOC 12.

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with the following performance:

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link | Related paper |
|---|---|---|---|---|---|---|---|
| Resnet-18-8s | RV-VOC12 | 59.0 | in prog. | in prog. | 28 ms. | Dropbox | DeepLab |
| Resnet-34-8s | RV-VOC12 | 68.0 | in prog. | in prog. | 50 ms. | Dropbox | DeepLab |
| Resnet-50-16s | VOC12 | 66.5 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-50-8s | VOC12 | 67.0 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-50-8s-deep-sup | VOC12 | 67.1 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-101-16s | VOC12 | 68.6 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| PSP-Resnet-18-8s | VOC12 | 68.3 | n/a | n/a | n/a | in prog. | PSPnet |
| PSP-Resnet-50-8s | VOC12 | 73.6 | n/a | n/a | n/a | in prog. | PSPnet |

Some qualitative results:

[Image: example segmentation results on PASCAL VOC 2012]

Endovis 2017

Implemented models were trained on the Endovis 2017 segmentation dataset; sequence number 3 was used for validation and was not included in the training data.

The code for training and validating the models is also provided in the library.

Additional qualitative results can be found in this YouTube playlist.

Binary Segmentation

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-9-8s | Seq # 3 * | 96.1 | in prog. | in prog. | 13.3 ms. | Dropbox |
| Resnet-18-8s | Seq # 3 | 96.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Seq # 3 | in prog. | in prog. | in prog. | 50 ms. | in prog. |

(*) The Resnet-9-8s network was tested at 0.5 reduced resolution (512 x 640).

Qualitative results (on validation sequence):

[Image: binary segmentation results on the validation sequence]

Multi-class Segmentation

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-18-8s | Seq # 3 | 81.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Seq # 3 | in prog. | in prog. | in prog. | 50 ms. | in prog. |

Qualitative results (on validation sequence):

[Image: multi-class segmentation results on the validation sequence]

Cityscapes

The dataset contains video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations for 5,000 frames. The annotations contain 19 classes, representing cars, roads, traffic signs, and so on.

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-18-32s | Validation set | 61.0 | in prog. | in prog. | in prog. | in prog. |
| Resnet-18-8s | Validation set | 60.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Validation set | 69.1 | in prog. | in prog. | 50 ms. | Dropbox |
| Resnet-50-16s-PSP | Validation set | 71.2 | in prog. | in prog. | in prog. | in prog. |

Qualitative results (on validation sequence):

The whole sequence can be viewed here.

[Image: segmentation results on the Cityscapes validation set]

Installation

This code requires:

  1. Pytorch.

  2. Some libraries, which can be acquired by installing the Anaconda package.

    Alternatively, you can install scikit-image, matplotlib and numpy using pip.

  3. Clone the library:

git clone --recursive https://github.com/warmspringwinds/pytorch-segmentation-detection

Then use this code snippet before you start using the library:

import sys
# update with your path
# All the jupyter notebooks in the repository already have this
sys.path.append("/your/path/pytorch-segmentation-detection/")
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

Here we use our fork of pytorch/vision, which might be merged upstream in the future. We have added it as a submodule to our repository.

  4. Download the segmentation or detection models that you want to use manually (links can be found in the tables above). A minimal usage sketch is shown below.
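For reference, here is a minimal inference sketch based on the repository's notebooks. The paths and the checkpoint file name (resnet_34_8s_68.pth, the PASCAL VOC Resnet-34-8s checkpoint) are placeholders for whatever you downloaded; image preprocessing follows the demo notebooks.

import sys
import torch
from torch.autograd import Variable

# Update with your paths; the forked torchvision submodule must come
# before any pip-installed torchvision on sys.path.
sys.path.append("/your/path/pytorch-segmentation-detection/")
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

import pytorch_segmentation_detection.models.resnet_dilated as resnet_dilated

fcn = resnet_dilated.Resnet34_8s(num_classes=21)
fcn.load_state_dict(torch.load('resnet_34_8s_68.pth'))
fcn.cuda()
fcn.eval()

# img: a normalized 1 x 3 x H x W float tensor prepared as in the demo notebooks
# logits = fcn(Variable(img.cuda()))     # Variable API of the PyTorch 0.3.x era
# prediction = logits.data.max(1)[1]     # per-pixel class indices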

About

If you use this code in your research, please cite the paper:

@article{pakhomov2017deep,
  title={Deep Residual Learning for Instrument Segmentation in Robotic Surgery},
  author={Pakhomov, Daniil and Premachandran, Vittal and Allan, Max and Azizian, Mahdi and Navab, Nassir},
  journal={arXiv preprint arXiv:1703.08580},
  year={2017}
}

During implementation, some preliminary experiments and notes were also reported.

pytorch-segmentation-detection's People

Contributors

erasaur, peteflorence, randl, warmspringwinds


pytorch-segmentation-detection's Issues

Changing the number of classes

Hi there,

First, thanks a lot for the good work, it's really useful!

I am trying to train the model on only one class (that class + background) using the code from resnet_34_8s_train.ipynb in a .py file. I am confident my dataset only has one class, so I changed the number of classes from 21 to 2, but I get the following error when starting the first iteration:

RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1518238441757/work/torch/lib/THC/generic/THCStorage.cu:58

I just wanted to make sure that for only 1 class I should set number_of_classes = 2 instead of 21, and to check whether you have been able to make your code work with a different number of classes. The full error is below:

  File "<ipython-input-1-574834e79b43>", line 1, in <module>
    runfile('/home/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation/py_version2.py', wdir='/home/john/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation')

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
    builtins.execfile(filename, *where)

  File "/home/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation/py_version2.py", line 280, in <module>
    loss.backward()

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/nn/_functions/thnn/upsampling.py", line 283, in backward
    grad_input = UpsamplingBilinear2dBackward.apply(grad_output, ctx.input_size, ctx.output_size)

  File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/nn/_functions/thnn/upsampling.py", line 296, in forward
    grad_output = grad_output.contiguous()
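For anyone hitting the same assert: a device-side assert at the loss/upsampling stage is often (though not always) caused by target masks containing label values outside [0, number_of_classes - 1], for example the PASCAL-style void label 255. A minimal sanity check, assuming the annotation mask is available as an integer tensor named anno (hypothetical name), could look like:

import torch

number_of_classes = 2  # background + the single foreground class

# anno is assumed to be the integer label mask that is fed to the loss
valid = anno[anno != 255]   # drop the usual PASCAL-style void label, if present
assert valid.min() >= 0
assert valid.max() < number_of_classes, \
    "mask contains labels >= number_of_classes; remap them before training"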

Incompatible ResNet arguments

While executing any of the notebooks in: pytorch_segmentation_detection/recipes/pascal_voc/segmentation/*.ipynb
the errors relate to initialization, specifically unknown arguments passed to the ResNet constructor. Tried on multiple cloud / GPU setups with the same output. Maybe there are unchecked files (there could be cached files in your setups; on a clean repo pull, these errors might occur for you as well).
[python2.7, pytorch-0.3.1]:

TypeError                                 Traceback (most recent call last)
<ipython-input-1-a4cbdc8e5706> in <module>()
     36 
     37 print(torch.__version__)
---> 38 fcn = resnet_dilated.Resnet34_8s(num_classes=21)
     39 fcn.load_state_dict(torch.load('resnet_34_8s_68.pth'))
     40 #fcn.load_state_dict(torch.load('resnet34-333f7ec4.pth'))

/models/pytorch-segmentation-detection/pytorch_segmentation_detection/models/resnet_dilated.py in __init__(self, num_classes)
    290         # Load the pretrained weights, remove avg pool
    291         # layer and get the output stride of 8
--> 292         resnet34_8s = models.resnet34(fully_conv=True, pretrained=True, output_stride=8, remove_avg_pool_layer=True)
    293         #resnet34_8s = models.resnet34(pretrained=True)
    294 

/usr/local/lib/python2.7/dist-packages/torchvision/models/resnet.pyc in resnet34(pretrained, **kwargs)
    172         pretrained (bool): If True, returns a model pre-trained on ImageNet
    173     """
--> 174     model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
    175     if pretrained:
    176         model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))

TypeError: __init__() got an unexpected keyword argument 'fully_conv'

@warmspringwinds Could you please add a requirements.txt file as well?
It would clear up a lot of confusion.

pip freeze > requirements.txt

Also, if you could provide resnet_34_8_66.pth, it would be helpful for executing resnet_34_8s_demo.ipynb without changes.

Many thanks!
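For what it's worth, this TypeError usually appears when the pip-installed torchvision is imported instead of the repository's vision submodule; only the fork's ResNet accepts fully_conv, output_stride and remove_avg_pool_layer. A quick way to check which torchvision is being picked up (a sketch, not an official fix):

import sys

# Make sure the forked submodule shadows any pip-installed torchvision
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

import torchvision
print(torchvision.__file__)  # should point into .../pytorch-segmentation-detection/vision/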

RuntimeError: value cannot be converted to type float without overflow

Hi, I tried to train the model using Python 3, but I got the issue below:

/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/nn/functional.py:2622: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.")
0.4247354666311015
0.5617590819998219
0.5815541637890524
0.6344758887881029
Traceback (most recent call last):
  File "pytorch_segmentation_detection/recipes/pascal_voc/segmentation/psp_resnet_50_8s_train.py", line 376, in <module>
    optimizer.step()
  File "/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/optim/adam.py", line 107, in step
    p.data.addcdiv_(-step_size, exp_avg, denom)
RuntimeError: value cannot be converted to type float without overflow: (3.52033e-08,-1.14383e-08)

Can someone give me a suggestion?

Optimizer for unet model on Pascal Voc segmentation

Hello,
Can I know the optimizer and its settings to use for a UNet model on Pascal VOC segmentation with focal loss?
Should I use any learning rate schedulers?
Also, is it better to take the mean focal loss or the sum?
I am training the model from scratch.
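The repository does not prescribe a setup for this, but as a reference point, here is a minimal per-pixel focal loss sketch written from the standard focal-loss formula; gamma, the ignore label and the mean-vs-sum reduction are left as choices:

import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0, ignore_index=255, reduction='mean'):
    """logits: N x C x H x W, target: N x H x W with integer class labels."""
    log_p = F.log_softmax(logits, dim=1)                   # N x C x H x W
    ce = F.nll_loss(log_p, target, ignore_index=ignore_index,
                    reduction='none')                      # per-pixel CE, N x H x W
    p_t = torch.exp(-ce)                                   # probability of the true class
    loss = (1 - p_t) ** gamma * ce
    # note: with 'mean', pixels equal to ignore_index contribute zeros to the average
    return loss.mean() if reduction == 'mean' else loss.sum()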

fully_conv in vgg16

Very good repository.

How did you make this work? It seems that vgg16 does not have the fully_conv keyword in torchvision:

vgg16 = models.vgg16(pretrained=True, fully_conv=True)

RuntimeError: Error(s) in loading state_dict for VGG

Using your fork of torchvision and the default installation of PyTorch for Linux / Python 3.6 / CUDA 10:

  1. The init_weights argument in the VGG class was missing.

  2. After fixing (1), the following error was generated:

In [1]: from torchvision import models                                                                               
In [2]: model = models.vgg16(pretrained=True, fully_conv=True)                                                       

RuntimeError                              Traceback (most recent call last)
<ipython-input-2-802ee77a237c> in <module>
----> 1 model = models.vgg16(pretrained=True, fully_conv=True)

~/repositories/github/pytorch-segmentation-detection/vision/torchvision/models/vgg.py in vgg16(pretrained, **kwargs)
    164     model = VGG(make_layers(cfg['D']), **kwargs)
    165     if pretrained:
--> 166         model.load_state_dict(model_zoo.load_url(model_urls['vgg16']))
    167     return model
    168 

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    767         if len(error_msgs) > 0:
    768             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 769                                self.__class__.__name__, "\n\t".join(error_msgs)))
    770 
    771     def _named_members(self, get_members_fn, prefix='', recurse=True):

RuntimeError: Error(s) in loading state_dict for VGG:
	size mismatch for classifier.0.weight: copying a param with shape torch.Size([4096, 25088]) from checkpoint, the shape in current model is torch.Size([4096, 512, 7, 7]).
	size mismatch for classifier.3.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([4096, 4096, 1, 1]).
	size mismatch for classifier.6.weight: copying a param with shape torch.Size([1000, 4096]) from checkpoint, the shape in current model is torch.Size([1000, 4096, 1, 1]).
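In case it helps: the shape mismatches come from the fully convolutional classifier (the Linear layers are replaced by Conv2d, so the stock ImageNet checkpoint no longer matches). One common workaround, a sketch rather than the fix intended by the fork, is to reshape the classifier weights into convolution filters before loading (the classic FCN "convolutionalization" trick):

import sys
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

import torch.utils.model_zoo as model_zoo
from torchvision import models   # resolves to the forked torchvision inserted above

# Build the fully convolutional VGG without loading weights yet
model = models.vgg16(pretrained=False, fully_conv=True)

# Fetch the stock ImageNet weights and reshape the classifier tensors:
#   4096 x 25088 -> 4096 x 512 x 7 x 7,  4096 x 4096 -> 4096 x 4096 x 1 x 1, ...
state_dict = model_zoo.load_url('https://download.pytorch.org/models/vgg16-397923af.pth')
for key in ['classifier.0.weight', 'classifier.3.weight', 'classifier.6.weight']:
    state_dict[key] = state_dict[key].view(model.state_dict()[key].shape)

model.load_state_dict(state_dict)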

Train on my own dataset without superpixels

First of all, great work!
I assume that in order to use it I have to train the weights on a dataset with superpixel annotations, or at least some bounding boxes.
Am I right?
What do you suggest I do if I have lots of clean videos in which I want to segment some features that are moving?
Should I start with simple bounding-box classification and only then proceed, or is there any shortcut for that?

Thanks

Error when trying to run resnet_34_8s_test

It says

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     34 img = Variable(img.cuda())
     35 
---> 36 fcn = resnet_dilated.Resnet34_8s(num_classes=19)
     37 fcn.load_state_dict(torch.load('/home/sawyer/workspace/segmentation/resnet_34_8s_cityscapes_best.pth'))
     38 fcn.cuda()

/home/sawyer/workspace/segmentation/pytorch-segmentation-detection/pytorch_segmentation_detection/models/resnet_dilated.pyc in __init__(self, num_classes)
    293                                          pretrained=True,
    294                                          output_stride=8,
--> 295                                          remove_avg_pool_layer=True)
    296 
    297         # Randomly initialize the 1x1 Conv scoring layer

/home/sawyer/workspace/segmentation/pytorch-segmentation-detection/vision/torchvision/models/resnet.pyc in resnet34(pretrained, **kwargs)
    172         pretrained (bool): If True, returns a model pre-trained on ImageNet
    173     """
--> 174     model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
    175     if pretrained:
    176         model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))

TypeError: __init__() got an unexpected keyword argument 'fully_conv'

In both Python 2.7 and 3.5, with torch version 1.0.1.

Error(s) in loading state_dict for Resnet18_8s

@warmspringwinds I am getting the following error for resnet_18_8s_59.pth

RuntimeError: Error(s) in loading state_dict for Resnet18_8s:
size mismatch for resnet18_8s.fc.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([21]).
size mismatch for resnet18_8s.fc.weight: copying a param with shape torch.Size([2, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([21, 512, 1, 1]).

If I change num_classes=21 to num_classes=2, it generates the output without any segmentation (purple screen)

new model implementation

I would like to test a new model. Is there a walkthrough of what I would need to do to test a new model with your system?

about Resnet18_8s

Hello, I am very impressed by your great work! However, I am a little confused when I look at your Resnet18_8s network. I assume Resnet18_8s follows the approach in your paper "Deep Residual Learning for Instrument Segmentation in Robotic Surgery", which employs dilated convolutions to keep the resolution. But in resnet_dilated.py, I could not find any dilated convolutions in the Resnet18_8s class. Could you please give a more detailed explanation of the structure of Resnet18_8s? Many thanks.

adaptive_computation_time

Hi Daniil,
I'm trying to work on image segmentation of microscopy images using pytorch.
I've been trying to work with your examples.
But I'm having an error with resnet_34_8s_train.

ImportError: No module named adaptive_computation_time

I wonder if it's something in the Anaconda environment?

Difference Between Semantic Segmentation and Image Classification

I'm new to implementing CNNs and I'm trying to understand how a model knows whether to perform semantic segmentation (pixelwise) or image classification (one class per image). As far as I can see, the only difference is in the models/resnet_dilated.py file, in the line
resnet34_8s.fc = nn.Conv2d(resnet34_8s.inplanes, num_classes, 1)

whereas most other code has it as
resnet34_8s.fc = nn.Linear(resnet34_8s.fc.in_features, num_classes)

Is this the difference between returning logits of shape [batch x num_classes x H x W] and [batch x num_classes]?
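For reference, the 1x1 convolution head keeps the spatial dimensions while the usual pooled linear head collapses them. A toy sketch illustrating the output shapes (with a random feature map, not the repository's exact code):

import torch
import torch.nn as nn

features = torch.randn(1, 512, 32, 32)    # B x C x H x W feature map from the backbone
num_classes = 21

# Segmentation head: a 1x1 convolution preserves the spatial dimensions
seg_head = nn.Conv2d(512, num_classes, kernel_size=1)
print(seg_head(features).shape)            # torch.Size([1, 21, 32, 32])

# Classification head: global pooling + linear layer collapses them
cls_head = nn.Linear(512, num_classes)
pooled = features.mean(dim=(2, 3))         # B x C
print(cls_head(pooled).shape)              # torch.Size([1, 21])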

about image size of training set

Hello Daniil,
In the training process of your ResNet-8s, I notice that you crop all training images to 224x224 (RandomCropJoint(crop_size=(224, 224))). But you didn't adopt this approach when you trained your FCN-32s model. Is it because the ResNet pretrained model is used for the initial weights, so we need to comply with its input image size (224x224) too? Do you think other input sizes can be used for training without causing an accuracy decline? Please advise. Thanks.

FCN Skip Connections

Hi, FCN-8s/16s (regardless of the base model being VGG/ResNet) should have skip connections for aggregating the features from pooling layers. But, I can't seem to find these in your model definitions.

Unable to run resnet_34_8s_demo.ipynb

TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (int, int, numpy.int64, numpy.int64), but expected one of:
 * no arguments
 * (int ...)
      didn't match because some of the arguments have invalid types: (int, int, numpy.int64, numpy.int64)
 * (torch.FloatTensor viewed_tensor)
 * (torch.Size size)
 * (torch.FloatStorage data)
 * (Sequence data)

Which version of PyTorch should be used?
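For reference: this TypeError is typical of older PyTorch versions (the notebooks were written around the 0.3.x era) being handed numpy integer sizes. If you stay on such a version, a common workaround is to cast them to plain Python ints before constructing the tensor, roughly like this (the variable names are hypothetical):

import numpy as np
import torch

batch_size, num_classes = 1, 21
height, width = np.int64(512), np.int64(512)   # e.g. values taken from img.shape

# Older tensor constructors reject numpy integer types; cast to plain int first
tensor = torch.FloatTensor(batch_size, num_classes, int(height), int(width))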

CRF implementation

Hi! Thank you for sharing your code :)
Where can I find the CRF implementation in this repo? I couldn't find any results when searching for the 'crf' keyword in this repo.
