
fcn-for-semantic-segmentation's Introduction

FCN-for-Semantic-Segmentation

Implementation and performance evaluation of FCN-16 and FCN-8. In addition, CRFs are used as a post-processing technique, and the results are compared.

PAPERS REFERRED :

  1. Fully Convolutional Networks for Semantic Segmentation
  2. Very Deep Convolutional Networks for Large-Scale Image Recognition
  3. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

IMPLEMENTATION STEPS :

1. Converting a classifier to dense FCN :

The model used for semantic segmentation is derived from VGG. VGG on its own is meant for classification. To make the model suitable for dense prediction, we remove the last fully connected layers of VGG and replace them with convolutions. We append a 1x1 convolution with channel dimension 21 to predict scores for each of the PASCAL classes (including background) at each of the coarse output locations, followed by a deconvolution layer to bilinearly upsample the coarse outputs to pixel-dense outputs.
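As a rough illustration of the bilinear upsampling mentioned above, the sketch below builds the weights that a deconvolution (transposed convolution) kernel could be initialised with so that it performs bilinear interpolation. The function names are hypothetical and not taken from the notebooks. The kernel-size rule (`2 * factor - factor % 2`) is a common convention, assumed here rather than confirmed by this repository.

```python
def bilinear_kernel_1d(factor):
    """1-D bilinear interpolation weights for an integer upsampling factor."""
    size = 2 * factor - factor % 2          # assumed kernel-size convention
    # Centre of the kernel in index coordinates (half-integer for even sizes).
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def bilinear_kernel_2d(factor):
    """2-D kernel as the outer product of the 1-D weights."""
    k = bilinear_kernel_1d(factor)
    return [[a * b for b in k] for a in k]


# Weights for a 2x upsampling layer (a 4x4 kernel).
for row in bilinear_kernel_2d(2):
    print(row)
```

In a multi-class network, the same 2-D kernel would typically be copied onto the diagonal of the (classes x classes) weight tensor, so that each class score map is upsampled independently.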

2. Transferring features of lower level layers to higher layers

We define a new fully convolutional net (FCN) for segmentation that combines layers of the feature hierarchy and refines the spatial precision of the output. While fully convolutionalized classifiers can be fine-tuned to segmentation and even score highly on the standard metric, their output is dissatisfyingly coarse. The 32 pixel stride at the final prediction layer limits the scale of detail in the upsampled output.
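To make the effect of the 32 pixel stride concrete, the following sketch computes how coarse the prediction grid is at each pooling layer. It assumes a 224x224 input and VGG-style convolutions that preserve spatial size, so only the 2x2 max-pooling layers shrink the feature map. Any extra input padding an implementation may apply is ignored, and the helper name is hypothetical.

```python
def feature_map_size(input_size, num_poolings):
    """Side length after `num_poolings` 2x2, stride-2 max-pooling layers."""
    size = input_size
    for _ in range(num_poolings):
        size //= 2
    return size


input_size = 224  # assumed example input resolution
for name, n in [("pool3", 3), ("pool4", 4), ("pool5", 5)]:
    stride = 2 ** n  # each pooling layer doubles the output stride
    grid = feature_map_size(input_size, n)
    print(f"{name}: stride {stride}, {grid}x{grid} predictions")
```

At stride 32, each predicted location has to be stretched over a 32x32 block of image pixels, which is why the upsampled output lacks fine detail.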

We address this by adding skips that combine the final prediction layer with lower layers with finer strides. This turns a line topology into a DAG with edges that skip ahead from lower layers to higher ones. As they see fewer pixels, the finer scale predictions should need fewer layers, so it makes sense to make them from shallower net outputs. Combining fine layers and coarse layers lets the model make local predictions that respect global structure.

We first divide the output stride in half by predicting from a 16 pixel stride layer. We add a 1x1 convolution layer on top of pool4 to produce additional class predictions. We fuse this output with the predictions computed on top of conv7 (convolutionalized fc7) at stride 32 by adding a 2x upsampling layer and summing both predictions.

Finally, the stride 16 predictions are upsampled back to the image.
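The fusion step described above can be sketched on tiny single-class score maps. This is a simplified illustration, not the notebooks' implementation. Nearest-neighbour replication stands in for the 2x upsampling layer, the maps hold one class instead of 21, and the variable names are made up for the example.

```python
def upsample_2x_nearest(score_map):
    """Double height and width by repeating each value (stand-in for the 2x upsampling layer)."""
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(coarse_scores, fine_scores):
    """Upsample the coarse (stride 32) scores 2x and add the finer (stride 16) scores."""
    up = upsample_2x_nearest(coarse_scores)
    return [[u + f for u, f in zip(u_row, f_row)]
            for u_row, f_row in zip(up, fine_scores)]


conv7_scores = [[1.0, 2.0],
                [3.0, 4.0]]                       # hypothetical stride-32 scores (2x2)
pool4_scores = [[0.5] * 4 for _ in range(4)]      # hypothetical stride-16 scores (4x4)

fused = fuse(conv7_scores, pool4_scores)          # stride-16 scores (4x4)
for row in fused:
    print(row)
```

The fused map has the resolution of the pool4 predictions, so the final upsampling back to the image only needs a factor of 16 instead of 32.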

We call this net FCN-16s. FCN-16s has only one skip connection, which transfers information from the 4th max-pooling layer (pool4). To improve the results further, FCN-8s introduces a second skip connection that transfers information from the 3rd max-pooling layer (pool3), in addition to the one from pool4.

3. Using CRF as a post-processing technique :

When predicting with the FCN, each pixel is labelled independently of its surrounding pixels, which may result in coarse segmentation. The CRF takes two inputs: the original image and the predicted class probabilities for each pixel. The CRF used here relies on a highly efficient inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels in an arbitrary feature space. It thereby takes the surrounding pixels into account when assigning a class to each pixel, which results in better semantic segmentation.
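As a minimal sketch of what "a linear combination of Gaussian kernels" means, the function below evaluates the two-kernel form used in the fully connected CRF paper: an appearance kernel over pixel position and colour, plus a smoothness kernel over position only. The weights and bandwidths are illustrative assumptions, not values taken from CRF.ipynb, and the function name is hypothetical.

```python
import math


def pairwise_kernel(pos_i, pos_j, rgb_i, rgb_j,
                    w_appearance=1.0, w_smoothness=1.0,
                    theta_alpha=80.0, theta_beta=13.0, theta_gamma=3.0):
    """Affinity between two pixels; larger values pull their labels together.

    All default weights and bandwidths are assumed values for illustration.
    """
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))  # squared distance in the image
    d_rgb = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))  # squared colour difference
    appearance = w_appearance * math.exp(
        -d_pos / (2 * theta_alpha ** 2) - d_rgb / (2 * theta_beta ** 2))
    smoothness = w_smoothness * math.exp(-d_pos / (2 * theta_gamma ** 2))
    return appearance + smoothness


# Two nearby pixels with similar colours versus two with very different colours.
similar = pairwise_kernel((10, 10), (12, 10), (200, 30, 30), (198, 32, 29))
different = pairwise_kernel((10, 10), (12, 10), (200, 30, 30), (20, 220, 40))
print(similar > different)
```

Nearby pixels with similar colours get a high affinity, so the CRF is penalised for giving them different labels. This is how it sharpens boundaries that the FCN output leaves coarse.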

INSTALLATION OF REQUIRED TOOLS

1. TensorFlow

TensorFlow is used as the backend for Keras. For installation instructions with and without GPU support, refer to https://www.tensorflow.org/install/install_sources

2. Keras

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation.

  • To install Keras

    sudo pip install keras

3. Matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

4. Skimage

Scikit-image is an image processing toolbox for SciPy. It is used for loading, saving and applying various transformations, such as color to gray and gray to color, on images.

5. graphviz

This package facilitates the creation and rendering of graph descriptions in the DOT language of the Graphviz graph drawing software from Python. It is required to plot the models in Keras.

  • To install graphviz

    sudo pip install graphviz

6. Jupyter Notebook

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.

Quick start

Run the following commands in Terminal.

git clone https://github.com/Gurupradeep/FCN-for-Semantic-Segmentation.git
cd FCN-for-Semantic-Segmentation
jupyter notebook

This opens the Jupyter file browser in your web browser, listing all the notebooks in the directory.

  • FCN-16.ipynb contains code related to implementation of FCN-16.

  • FCN-8.ipynb contains code related to implementation of FCN-8.

  • Comparison_of_fcn8_and_fcn16.ipynb has code which compares results of FCN-8 and FCN-16 models.

  • CRF.ipynb has code which is used to compare the results after applying CRF on FCN-8 and FCN-16 annotated images.

  • All the images which are used can be found in the TestImages folder (https://github.com/Gurupradeep/FCN-for-Semantic-Segmentation/tree/master/TestImages).

Open the respective notebooks and run the cells to reproduce the results. Because the code runs in Jupyter Notebook, the output of each cell is visible as soon as it has executed.

fcn-for-semantic-segmentation's People

Contributors

gurupradeep


fcn-for-semantic-segmentation's Issues

Coarse output from the network

Hello, thanks for your work. I have tried to rewrite your network in PyTorch, but what I get from the network is a coarse image in which I can only see the outline of the object being segmented. Could you tell me where I went wrong? Thanks!

My model code is as follows:

import torchvision.models as models
import torch.nn as nn
import torch.nn.functional as F

# referred to this site: https://github.com/Gurupradeep/FCN-for-Semantic-Segmentation
class MyFCN(nn.Module):
    def __init__(self):
        super().__init__()
        model = models.vgg16(pretrained=True)
        self.backbone_third = model.features[:17]  # (256, 28, 28) third pooling before conv layer
        self.backbone_fourth = model.features[:24] # (512, 14, 14) fourth pooling before conv layer
        self.backbone_fifth = model.features[:31]  # (512, 7, 7) final pooling before conv layer

        self.conv_256_1 = nn.Sequential(
            nn.Conv2d(256, 1, (1, 1), 1),
        )

        self.conv_512_1 = nn.Sequential(
            nn.Conv2d(512, 1, (1, 1), 1),
        )

        # fc6
        self.conv_512_4096 = nn.Sequential(
            nn.Conv2d(512, 4096, (7, 7), 1, 3),
            nn.ReLU(inplace=True),
        )

        # fc7
        self.conv_4096_4096 = nn.Sequential(
            nn.Conv2d(4096, 4096, (1, 1), 1),
            nn.ReLU(inplace=True),
        )

        # score_fr
        self.conv_4096_1 = nn.Sequential(
            nn.Conv2d(4096, 1, (1, 1), 1),
            nn.ReLU(inplace=True),
        )

        # score_2 for 7=>14 and 14=>28
        self.conv_transpose = nn.Sequential(
            nn.ConvTranspose2d(1, 1, (4, 4), 2),
        )

        # final upsample
        self.conv_transpose_8 = nn.Sequential(
            nn.ConvTranspose2d(1, 1, (16, 16), 8),
        )


    def forward(self, x):
        x_from_pooling_3 = self.backbone_third(x)
        x_from_pooling_4 = self.backbone_fourth(x)
        x_from_pooling_5 = self.backbone_fifth(x)

        # pooling 3
        x_3 = self.conv_256_1(x_from_pooling_3)

        # pooling 4
        x_4 = self.conv_512_1(x_from_pooling_4)     # (1, 1, 14, 14)

        # pooling 5
        x_5 = self.conv_512_4096(x_from_pooling_5)  # (1, 4096, 7, 7)
        x_5 = self.conv_4096_4096(x_5)              # (1, 4096, 7, 7)
        x_5 = self.conv_4096_1(x_5)                 # (1, 1, 7, 7)
        x_5 = self.conv_transpose(x_5)              # (1, 1, 16, 16)
        x_5 = F.pad(x_5, (-1, -1, -1, -1))          # crop layer, (1, 1, 14, 14)

        # fusing x_4
        x_fused_1 = x_4 + x_5                       # (1, 1, 14, 14)
        x_fused_1 = self.conv_transpose(x_fused_1)  # (1, 1, 30, 30)
        x_fused_1 = F.pad(x_fused_1, (-1, -1, -1, -1))  # crop layer, (1, 1, 28, 28)

        # fusing x_3
        x_fused_2 = x_3 + x_fused_1
        x_fused_2 = self.conv_transpose_8(x_fused_2)    # (1, 1, 232, 232)
        x_fused_2 = F.pad(x_fused_2, (-4, -4, -4, -4))  # crop layer (1, 1, 224, 224)

        return x_fused_2

and my output looks like this:
(image: epoch_17)
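The shape comments in the forward pass above can be checked without running the model. The sketch below applies the transposed-convolution output-size formula (with dilation 1) and the symmetric crop performed by negative `F.pad` values. The helper names are hypothetical. It only confirms the tensor sizes and does not diagnose the coarse output.

```python
def conv_transpose_out(size, kernel, stride, padding=0, output_padding=0):
    """Output side length of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def crop(size, border):
    """Side length after removing `border` pixels from each side."""
    return size - 2 * border


s = conv_transpose_out(7, kernel=4, stride=2)    # 7x7 conv7 scores upsampled 2x
print(s, crop(s, 1))                             # before and after the crop
s = conv_transpose_out(14, kernel=4, stride=2)   # fused with pool4, upsampled 2x
print(s, crop(s, 1))
s = conv_transpose_out(28, kernel=16, stride=8)  # fused with pool3, upsampled 8x
print(s, crop(s, 4))
```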

pydensecrf package problem

I want to use CRF.ipynb for post-processing, but I get this error: module 'pydensecrf.densecrf' has no attribute 'DenseCRF2D'.
(Before that, I ran: pip install pydensecrf)

Where is the weights.h5 file?

I tried running your code, but I always get an error saying that weights.h5 cannot be found. I downloaded another weights file, but using it produces a different error: "you are trying to load a 0 layer model into a 19 layer model". Upon further digging, I found out that this has something to do with the version of Keras being used.

So my questions are:

  1. Where is the weights.h5 file?
  2. If I can't get that file, what is the version of keras you used to build this project?

Thank you

Why does the CRF code produce output like this?

Hello, thanks for sharing the code. I have a question: why does the CRF code produce output like this?

(image)

PS: My training images are three-channel images, and my prediction is already a colored three-channel image. I called the CRF function with the original image, the colored prediction and the output image. Is that a mistake?

pascal-fcn16s-dag.mat

Hello, can you tell me how I can get the pascal-fcn16s-dag.mat file that is used in fcn_16.ipynb?
