warmspringwinds / pytorch-segmentation-detection
Image Segmentation and Object Detection in Pytorch
License: MIT License
Hi Daniil,
I'm trying to work on image segmentation of microscopy images using PyTorch, and I've been working through your examples, but I'm getting an error on resnet_34_8s_train:
ImportError: No module named adaptive_computation_time
I wonder if it's something in my Anaconda setup?
Hi, I found your FLOPs benchmark quite useful. Is it possible to add a license file to your project (maybe similar to the one you used in another project), so that I can cite your work cleanly if I use your code?
https://github.com/warmspringwinds/tf-image-segmentation/blob/master/LICENSE
TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (int, int, numpy.int64, numpy.int64), but expected one of:
* no arguments
* (int ...)
didn't match because some of the arguments have invalid types: (int, int, numpy.int64, numpy.int64)
* (torch.FloatTensor viewed_tensor)
* (torch.Size size)
* (torch.FloatStorage data)
* (Sequence data)
Which version of pytorch is to be used?
I'm new to implementing CNNs and I'm trying to understand how a model knows whether to perform semantic segmentation (pixelwise classification) or image classification (one class per image). As far as I can see, the only difference is in the models/resnet_dilated.py file, in the line
resnet34_8s.fc = nn.Conv2d(resnet34_8s.inplanes, num_classes, 1)
whereas most other codes have it as
resnet34_8s.fc = nn.Linear(resnet34_8s.fc.in_features, num_classes)
Is this the difference between returning a logits of shape [batch x num_classes x H x W] and [batch x num_classes]?
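If it helps to see the shapes concretely, here is a minimal sketch of the two kinds of head. The 512 input channels and the 28x28 feature map are illustrative assumptions (512 does match ResNet-34's final block), not values taken from the repo:

```python
import torch
import torch.nn as nn

# Hypothetical backbone output: [batch, channels, H, W]
features = torch.randn(2, 512, 28, 28)

conv_head = nn.Conv2d(512, 21, kernel_size=1)  # segmentation: per-pixel scores
linear_head = nn.Linear(512, 21)               # classification: one score vector

seg_logits = conv_head(features)                    # keeps the spatial grid
cls_logits = linear_head(features.mean(dim=(2, 3)))  # global-average-pool first
```

The 1x1 convolution preserves the spatial grid, so it yields a [batch x num_classes x H x W] score map; the linear head only applies after the spatial dimensions are pooled away, giving [batch x num_classes].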
TypeError Traceback (most recent call last)
in ()
34 img = Variable(img.cuda())
35
---> 36 fcn = resnet_dilated.Resnet34_8s(num_classes=19)
37 fcn.load_state_dict(torch.load('/home/sawyer/workspace/segmentation/resnet_34_8s_cityscapes_best.pth'))
38 fcn.cuda()
/home/sawyer/workspace/segmentation/pytorch-segmentation-detection/pytorch_segmentation_detection/models/resnet_dilated.pyc in __init__(self, num_classes)
293 pretrained=True,
294 output_stride=8,
--> 295 remove_avg_pool_layer=True)
296
297 # Randomly initialize the 1x1 Conv scoring layer
/home/sawyer/workspace/segmentation/pytorch-segmentation-detection/vision/torchvision/models/resnet.pyc in resnet34(pretrained, **kwargs)
172 pretrained (bool): If True, returns a model pre-trained on ImageNet
173 """
--> 174 model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
175 if pretrained:
176 model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))
TypeError: __init__() got an unexpected keyword argument 'fully_conv'
Seen with both Python 2.7 and 3.5, torch version 1.0.1.
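This `unexpected keyword argument 'fully_conv'` usually means Python picked up the stock pip-installed torchvision instead of the fork bundled with the repo. A minimal sketch of forcing the fork to the front of the module search path; the path itself is a placeholder you would adjust to your clone location:

```python
import sys

# Placeholder: point this at the 'vision' folder inside your clone of
# pytorch-segmentation-detection (which contains the forked torchvision).
FORK_PATH = '/path/to/pytorch-segmentation-detection/vision'

if FORK_PATH not in sys.path:
    # Prepending makes the forked torchvision win over the pip-installed one
    sys.path.insert(0, FORK_PATH)
```

Any `import torchvision` executed after this point resolves to the fork, which is the version that understands `fully_conv`, `output_stride`, and `remove_avg_pool_layer`.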
Hello,
Can I know the optimizer and its settings to use with the U-Net model for Pascal VOC segmentation using focal loss?
Should I use a learning rate scheduler?
Also, is it better to take the mean focal loss or the sum?
I am training the model from scratch.
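On the mean-vs-sum question, a minimal sketch of a multi-class focal loss may make the trade-off concrete. This is not the repo's implementation, just one common formulation; the tensor shapes are assumptions:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, reduction='mean'):
    """Sketch of a multi-class focal loss.

    logits:  [batch, num_classes, H, W]
    targets: [batch, H, W] with integer class indices
    """
    ce = F.cross_entropy(logits, targets, reduction='none')  # per-pixel CE
    pt = torch.exp(-ce)                                      # prob. of true class
    loss = (1.0 - pt) ** gamma * ce                          # down-weight easy pixels
    if reduction == 'mean':
        return loss.mean()  # invariant to image size: easier to tune the lr for
    return loss.sum()       # scales with pixel count: the lr must compensate

torch.manual_seed(0)
logits = torch.randn(2, 21, 8, 8)
targets = torch.randint(0, 21, (2, 8, 8))
mean_loss = focal_loss(logits, targets, reduction='mean')
sum_loss = focal_loss(logits, targets, reduction='sum')
```

The two reductions differ only by a constant factor (the number of pixels), so they are equivalent up to a rescaled learning rate; the mean is usually more convenient because its magnitude does not change with crop size.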
Using your fork of torchvision and the default installation of PyTorch for Linux / Python 3.6 / CUDA 10:
(1) The init_weights argument in the class VGG was missing.
After fixing (1), the following error was generated:
In [1]: from torchvision import models
In [2]: model = models.vgg16(pretrained=True, fully_conv=True)
RuntimeError Traceback (most recent call last)
<ipython-input-2-802ee77a237c> in <module>
----> 1 model = models.vgg16(pretrained=True, fully_conv=True)
~/repositories/github/pytorch-segmentation-detection/vision/torchvision/models/vgg.py in vgg16(pretrained, **kwargs)
164 model = VGG(make_layers(cfg['D']), **kwargs)
165 if pretrained:
--> 166 model.load_state_dict(model_zoo.load_url(model_urls['vgg16']))
167 return model
168
~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
767 if len(error_msgs) > 0:
768 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 769 self.__class__.__name__, "\n\t".join(error_msgs)))
770
771 def _named_members(self, get_members_fn, prefix='', recurse=True):
RuntimeError: Error(s) in loading state_dict for VGG:
size mismatch for classifier.0.weight: copying a param with shape torch.Size([4096, 25088]) from checkpoint, the shape in current model is torch.Size([4096, 512, 7, 7]).
size mismatch for classifier.3.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([4096, 4096, 1, 1]).
size mismatch for classifier.6.weight: copying a param with shape torch.Size([1000, 4096]) from checkpoint, the shape in current model is torch.Size([1000, 4096, 1, 1]).
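The size mismatches say the checkpoint stores flat FC weights while the fully convolutional model expects conv kernels of the same element count. One way this is commonly bridged (I'm assuming the fork does something equivalent internally) is to view the FC weights as conv kernels before loading; the shapes below are taken from the error message:

```python
import torch

# Stand-in for the downloaded VGG-16 checkpoint; only the weight shapes matter here
state = {
    'classifier.0.weight': torch.randn(4096, 25088),  # 25088 = 512 * 7 * 7
    'classifier.3.weight': torch.randn(4096, 4096),
    'classifier.6.weight': torch.randn(1000, 4096),
}

# Reinterpret flat FC weights as conv kernels (7x7 for the first, 1x1 after)
state['classifier.0.weight'] = state['classifier.0.weight'].view(4096, 512, 7, 7)
state['classifier.3.weight'] = state['classifier.3.weight'].view(4096, 4096, 1, 1)
state['classifier.6.weight'] = state['classifier.6.weight'].view(1000, 4096, 1, 1)
```

Since `view` only reshapes, the converted layers compute exactly what the FC layers did on 224x224 inputs, while also accepting larger inputs.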
Hi, I try to train the model using Python 3, but I get the issue below:
/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/nn/functional.py:2622: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.")
0.4247354666311015
0.5617590819998219
0.5815541637890524
0.6344758887881029
Traceback (most recent call last):
File "pytorch_segmentation_detection/recipes/pascal_voc/segmentation/psp_resnet_50_8s_train.py", line 376, in <module>
optimizer.step()
File "/home/v2m/anaconda3/envs/my_env3/lib/python3.7/site-packages/torch/optim/adam.py", line 107, in step
p.data.addcdiv_(-step_size, exp_avg, denom)
RuntimeError: value cannot be converted to type float without overflow: (3.52033e-08,-1.14383e-08)
Can someone give me a suggestion?
@warmspringwinds I am getting the following error for resnet_18_8s_59.pth
RuntimeError: Error(s) in loading state_dict for Resnet18_8s:
size mismatch for resnet18_8s.fc.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([21]).
size mismatch for resnet18_8s.fc.weight: copying a param with shape torch.Size([2, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([21, 512, 1, 1]).
If I change num_classes=21 to num_classes=2, it generates the output without any segmentation (purple screen)
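One common reason for a "purple screen" output: with num_classes=2 the prediction map contains only 0s and 1s, which are nearly indistinguishable when rendered on a 0-255 intensity scale or with a colormap spread over 21 classes. A sketch with a hypothetical binary mask, stretched to full range before display:

```python
import numpy as np

# Hypothetical binary prediction map (0 = background, 1 = foreground)
pred = np.array([[0, 1, 1],
                 [0, 0, 1]])

# 0 and 1 look almost identical on a 0-255 display scale;
# stretch the mask to the full range before imshow()/saving.
display = (pred * 255).astype(np.uint8)
```

If the stretched mask is still uniform, the model genuinely predicts one class everywhere and the problem is in training, not visualization.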
Hello, I am very impressed by your great work! However, I am a little confused when looking at your Resnet18_8s network. I assume Resnet18_8s follows the approach in your paper "Deep Residual Learning for Instrument Segmentation in Robotic Surgery", which employs dilated convolutions to keep resolution. But in resnet_dilated.py, I could not find any dilated convolutions in the class Resnet18_8s. Could you please give a more detailed explanation of the structure of Resnet18_8s? Many thanks.
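For what it's worth, the dilation is most likely introduced inside the forked ResNet backbone (via the output_stride argument) rather than in resnet_dilated.py itself; I'm assuming that here. This toy comparison shows the mechanism: a dilated convolution keeps the resolution that a strided one would halve, while growing the receptive field the same way (illustrative layers, not the repo's):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)

# A strided stage halves spatial resolution
strided = nn.Conv2d(64, 64, 3, stride=2, padding=1)
# The dilated replacement keeps resolution; padding=2 offsets the
# larger effective kernel (3 + (3-1)*(2-1) = 5)
dilated = nn.Conv2d(64, 64, 3, stride=1, padding=2, dilation=2)

y_strided = strided(x)
y_dilated = dilated(x)
```

Replacing the last strided stages of ResNet with dilated ones is how an output stride of 8 is obtained instead of the usual 32, which is what the "_8s" suffix refers to.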
Very good repository.
How did you make this work? It seems that vgg16 does not have the fully_conv keyword in torchvision:
vgg16 = models.vgg16(pretrained=True, fully_conv=True)
Hi, FCN-8s/16s (regardless of the base model being VGG/ResNet) should have skip connections for aggregating the features from pooling layers. But, I can't seem to find these in your model definitions.
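For reference, a minimal sketch of what an FCN-8s-style skip-connection head could look like. The channel sizes assume a VGG-like backbone (256/512/512 at pool3/pool4/pool5); this is hypothetical code, not the repo's definition:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FCN8sHead(nn.Module):
    """Aggregate scores from three backbone stages, FCN-8s style (sketch)."""

    def __init__(self, num_classes, c3=256, c4=512, c5=512):
        super().__init__()
        self.score3 = nn.Conv2d(c3, num_classes, 1)
        self.score4 = nn.Conv2d(c4, num_classes, 1)
        self.score5 = nn.Conv2d(c5, num_classes, 1)

    def forward(self, pool3, pool4, pool5, out_size):
        s5 = self.score5(pool5)
        # Upsample the deeper scores and add the shallower ones (skip connections)
        s4 = self.score4(pool4) + F.interpolate(
            s5, size=pool4.shape[2:], mode='bilinear', align_corners=False)
        s3 = self.score3(pool3) + F.interpolate(
            s4, size=pool3.shape[2:], mode='bilinear', align_corners=False)
        return F.interpolate(
            s3, size=out_size, mode='bilinear', align_corners=False)

head = FCN8sHead(num_classes=21)
logits = head(torch.randn(1, 256, 28, 28),   # hypothetical pool3
              torch.randn(1, 512, 14, 14),   # hypothetical pool4
              torch.randn(1, 512, 7, 7),     # hypothetical pool5
              out_size=(224, 224))
```

Note that the "_8s" models in this repo may instead reach an output stride of 8 with dilation alone, in which case no skip connections are needed to recover resolution; the name then refers to the stride, not to the original FCN-8s skip architecture.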
Hi @warmspringwinds
I am getting - IOError: [Errno 2] No such file or directory: 'resnet_34_8s_66.pth'
when running resnet_34_8s_demo.ipynb
Not able to figure out when and how the model file gets generated. Any clue on the next step?
Thanks
Akash
While executing any of the notebooks in pytorch_segmentation_detection/recipes/pascal_voc/segmentation/*.ipynb, the errors relate to unknown arguments passed to the ResNet object's init. Tried on multiple cloud / GPU setups with the same output. Maybe there are unchecked files (there could be cached files in your setups; on a clean repo pull, these errors might occur for you as well).
[python2.7, pytorch-0.3.1]:
TypeErrorTraceback (most recent call last)
<ipython-input-1-a4cbdc8e5706> in <module>()
36
37 print(torch.__version__)
---> 38 fcn = resnet_dilated.Resnet34_8s(num_classes=21)
39 fcn.load_state_dict(torch.load('resnet_34_8s_68.pth'))
40 #fcn.load_state_dict(torch.load('resnet34-333f7ec4.pth'))
/models/pytorch-segmentation-detection/pytorch_segmentation_detection/models/resnet_dilated.py in __init__(self, num_classes)
290 # Load the pretrained weights, remove avg pool
291 # layer and get the output stride of 8
--> 292 resnet34_8s = models.resnet34(fully_conv=True, pretrained=True, output_stride=8, remove_avg_pool_layer=True)
293 #resnet34_8s = models.resnet34(pretrained=True)
294
/usr/local/lib/python2.7/dist-packages/torchvision/models/resnet.pyc in resnet34(pretrained, **kwargs)
172 pretrained (bool): If True, returns a model pre-trained on ImageNet
173 """
--> 174 model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
175 if pretrained:
176 model.load_state_dict(model_zoo.load_url(model_urls['resnet34']))
TypeError: __init__() got an unexpected keyword argument 'fully_conv'
@warmspringwinds Could you please add a requirements.txt file as well? It would clear up a lot of confusion:
pip freeze > requirements.txt
Also, if you could put in resnet_34_8s_66.pth, it would help to execute resnet_34_8s_demo.ipynb without changes.
Many thanks!
Hi there,
First, thanks a lot for the good work, it's really useful!
I am trying to train the model on only one class (that class + background) using the code from resnet_34_8s_train.ipynb in a .py file. I am confident my dataset only has one class, so I changed the number of classes from 21 to 2, but I get the following error when starting the first iteration:
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1518238441757/work/torch/lib/THC/generic/THCStorage.cu:58
I just wanted to make sure that for only 1 class I should set number_of_classes = 2 instead of 21, and that you were able to make your code work with a different number of classes. The full error is below:
File "<ipython-input-1-574834e79b43>", line 1, in <module>
runfile('/home/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation/py_version2.py', wdir='/home/john/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation')
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
builtins.execfile(filename, *where)
File "/home/ft_fcnpt/pytorch-segmentation-detection-master/pytorch_segmentation_detection/recipes/pascal_voc/segmentation/py_version2.py", line 280, in <module>
loss.backward()
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
return self._forward_cls.backward(self, *args)
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/nn/_functions/thnn/upsampling.py", line 283, in backward
grad_input = UpsamplingBilinear2dBackward.apply(grad_output, ctx.input_size, ctx.output_size)
File "/home/anaconda3/envs/pt27/lib/python2.7/site-packages/torch/nn/_functions/thnn/upsampling.py", line 296, in forward
grad_output = grad_output.contiguous()
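A device-side assert (CUDA error 59) around the loss backward pass very often means the annotation contains class indices outside [0, num_classes). With num_classes=2 this is easy to hit: Pascal VOC masks use 255 as an "ignore" border label, which must be excluded or remapped before computing the loss. A hedged sanity check with a hypothetical annotation:

```python
import numpy as np

num_classes = 2
# Hypothetical annotation with Pascal VOC's 255 'ignore' border label
annotation = np.array([[0, 1, 255],
                       [1, 0, 255]])

# Drop the ignore label, then verify the remaining indices are valid
valid = annotation[annotation != 255]
assert valid.min() >= 0 and valid.max() < num_classes, \
    "labels outside [0, num_classes) trigger device-side asserts on CUDA"
```

Running the same script on CPU usually turns the opaque CUDA assert into a readable index-out-of-range error, which makes the offending label easy to find.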
Hello Daniil,
In the training process of your ResNet-8s, I notice that you crop all training images to 224x224 (RandomCropJoint(crop_size=(224, 224))). But you didn't adopt this approach when training your FCN-32s model. Is it because the ResNet pretrained model is used as initial weights, so we need to comply with its input image size (224x224) too? Do you think other input sizes can be used for training without causing an accuracy decline? Please advise. Thanks.
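A fully convolutional network does not require 224x224 inputs; any size that survives the downsampling stages works, and the crop size is mainly a batching/memory choice. A toy net with three stride-2 stages (output stride 8, like the models here, though the layers are purely illustrative) accepts several input sizes:

```python
import torch
import torch.nn as nn

# Illustrative fully convolutional net: three stride-2 stages -> output stride 8
net = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1),
    nn.Conv2d(8, 8, 3, stride=2, padding=1),
    nn.Conv2d(8, 8, 3, stride=2, padding=1),
)

# Each input size maps to spatial dimensions divided by 8
shapes = {hw: tuple(net(torch.randn(1, 3, hw, hw)).shape)
          for hw in (224, 320, 512)}
```

Sizes that are multiples of the output stride are the safest choice, since they avoid rounding at each downsampling stage.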
First of all, great work!
I assume that in order to use it I have to train the weights on a dataset with superpixel annotations, or at least some bounding boxes.
Am I right?
What do you suggest I do if I have lots of clean videos in which I want to segment some features that are moving?
Should I start with simple bounding-box classification and only then proceed, or is there a shortcut for that?
Thanks
I would like to test a new model. Is there a walkthrough of what I would need to do to test a new model with your system?
Hi! Thank you for sharing your code :)
Where can I find the CRF implementation in this repo? I couldn't find any results when searching with the 'crf' keyword.