Image Segmentation and Object Detection in Pytorch

Pytorch-Segmentation-Detection is a library for dense inference and training of Convolutional Neural Networks (CNNs) on images for segmentation and detection. The aim of the library is to provide a simplified way to:

  • Convert some popular general, medical, and other image segmentation and detection datasets into an easy-to-use training format (Pytorch's dataloader).
  • Train with on-the-fly data augmentation (scaling, color distortion).
  • Use a training routine that is proven to work for a particular model/dataset pair.
  • Evaluate the accuracy of trained models with common measures: Mean IOU, Mean pix. accuracy, Pixel accuracy, Mean AP.
  • Obtain model files that were trained on a particular dataset with reported accuracy (models trained using this library with the reported training routine, not models converted from Caffe or another framework).
  • Use model definitions (like FCN-32s and others) that initialize their weights from image classification models like VGG that are officially provided by the Pytorch/Vision library.
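The segmentation measures named in the list above (Pixel accuracy, Mean pix. accuracy, Mean IOU) can all be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the library's own evaluation code. The function names `confusion_matrix` and `segmentation_scores` are hypothetical, and the inputs are assumed to be flat sequences of integer class labels. Mean AP, which applies to detection, is not covered here.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


def segmentation_scores(matrix):
    """Return (pixel accuracy, mean pixel accuracy, mean IOU)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    per_class_acc, per_class_iou = [], []
    for i in range(n):
        gt = sum(matrix[i])                         # pixels whose true class is i
        pred = sum(matrix[j][i] for j in range(n))  # pixels predicted as class i
        union = gt + pred - matrix[i][i]
        # Classes absent from both ground truth and prediction are skipped
        # (an assumption of this sketch; conventions differ between toolkits).
        if gt > 0:
            per_class_acc.append(matrix[i][i] / gt)
        if union > 0:
            per_class_iou.append(matrix[i][i] / union)
    pixel_acc = correct / total
    mean_pix_acc = sum(per_class_acc) / len(per_class_acc)
    mean_iou = sum(per_class_iou) / len(per_class_iou)
    return pixel_acc, mean_pix_acc, mean_iou
```

In practice the same counts would usually be accumulated over the whole validation set with NumPy or Pytorch tensors before the scores are computed once at the end.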

So far, the library contains implementations of the FCN-32s (Long et al.), Resnet-18-8s, and Resnet-34-8s (Chen et al.) image segmentation models in Pytorch and the Pytorch/Vision library, with a training routine, reported accuracy, and trained models for the PASCAL VOC 2012 dataset. To train these models on your own data, you will have to write a dataloader for your dataset.
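Writing a dataloader for your own data mostly means exposing image/annotation pairs through `__len__` and `__getitem__`, the protocol that Pytorch's map-style datasets follow. The sketch below is an illustration under stated assumptions, not a class from this library. The names `PairedSegmentationDataset`, `load_fn`, and `joint_transform` are hypothetical, and the class deliberately avoids importing torch so that it runs on its own.

```python
class PairedSegmentationDataset:
    """Map-style dataset yielding (image, annotation) pairs.

    Hypothetical example: `load_fn` turns a file path into an array or
    tensor, and `joint_transform` applies the same augmentation (for
    example, scaling) to both the image and its annotation.
    """

    def __init__(self, image_paths, annotation_paths, load_fn, joint_transform=None):
        if len(image_paths) != len(annotation_paths):
            raise ValueError("each image needs exactly one annotation")
        self.pairs = list(zip(image_paths, annotation_paths))
        self.load_fn = load_fn
        self.joint_transform = joint_transform

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        image_path, annotation_path = self.pairs[index]
        image = self.load_fn(image_path)
        annotation = self.load_fn(annotation_path)
        if self.joint_transform is not None:
            # Geometric augmentations must be applied to both items together,
            # otherwise the annotation no longer lines up with the image.
            image, annotation = self.joint_transform(image, annotation)
        return image, annotation
```

In real use you would typically subclass `torch.utils.data.Dataset`, load files with an image library such as scikit-image, and wrap the dataset in `torch.utils.data.DataLoader` for batching and shuffling.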

Models for Object Detection will be released soon.

Installation

This code requires:

  1. Pytorch.

  2. Some libraries, which can be acquired by installing the Anaconda distribution.

    Alternatively, you can install scikit-image, matplotlib, and numpy using pip.

  3. Clone the library:

git clone --recursive https://github.com/warmspringwinds/pytorch-segmentation-detection

And use this code snippet before you start to use the library:

import sys
# update with your path
# All the jupyter notebooks in the repository already have this
sys.path.append("/your/path/pytorch-segmentation-detection/")
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

Here we use our pytorch/vision fork, which might be merged upstream in the future. We have added it as a submodule to our repository.

  4. Manually download the segmentation or detection models that you want to use (links can be found below).

PASCAL VOC 2012 (Segmentation)

Implemented models were tested on the Restricted PASCAL VOC 2012 Validation dataset (RV-VOC12) and trained on the PASCAL VOC 2012 Training data and additional Berkeley segmentation data for PASCAL VOC 12. It was important to test models on the restricted validation dataset to make sure that no images in the validation dataset were seen by the model during training.

The code for acquiring the training data and validating the model is also provided in the library.
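The idea behind a restricted validation set can be sketched as a simple filter: keep only those validation images that appear in none of the training lists. This is a hedged illustration, not the library's own split code. The function name `restricted_validation_ids` and the image IDs in the usage note are made up for the example.

```python
def restricted_validation_ids(validation_ids, *training_id_lists):
    """Keep only validation image IDs that appear in none of the training lists."""
    seen_in_training = set()
    for ids in training_id_lists:
        seen_in_training.update(ids)
    # Preserve the original validation order while dropping overlapping images.
    return [image_id for image_id in validation_ids if image_id not in seen_in_training]
```

For example, with hypothetical IDs, `restricted_validation_ids(["img_1", "img_2", "img_3"], ["img_2"], ["img_9"])` keeps `img_1` and `img_3`, because `img_2` also occurs in one of the training lists.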

Fully Convolutional Networks for Semantic Segmentation (FCNs)

Here you can find models that were described in the paper "Fully Convolutional Networks for Semantic Segmentation" by Long et al. We trained and tested FCN-32s, FCN-16s (in prog.), and FCN-8s (in prog.) against the PASCAL VOC 2012 dataset.

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with this performance:

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| FCN-32s (ours) | RV-VOC12 | 60.0 | in prog. | in prog. | 41 ms. | Dropbox |
| FCN-16s (ours) | RV-VOC12 | in prog. | in prog. | in prog. | in prog. | in prog. |
| FCN-8s (ours) | RV-VOC12 | in prog. | in prog. | in prog. | in prog. | in prog. |
| FCN-32s (orig.) | RV-VOC11 | 59.40 | 73.30 | 89.10 | in prog. | |
| FCN-16s (orig.) | RV-VOC11 | 62.40 | 75.70 | 90.00 | in prog. | |
| FCN-8s (orig.) | RV-VOC11 | 62.70 | 75.90 | 90.30 | in prog. | |
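The source does not state how the values in the "Inference time" column were measured. One common approach, sketched here under that caveat, is to run a few warm-up passes and then average the wall-clock time over several repeats. The helper name `mean_latency_ms` and its parameters are hypothetical, and `run_inference` stands for any zero-argument callable that performs one forward pass.

```python
import time


def mean_latency_ms(run_inference, warmup=3, repeats=10):
    """Average wall-clock time of one call to `run_inference`, in milliseconds."""
    for _ in range(warmup):
        run_inference()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        run_inference()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / repeats
```

When timing a model on a GPU, the callable should also wait for the device to finish (for example by calling `torch.cuda.synchronize()` after the forward pass), because CUDA kernels are launched asynchronously and the timer would otherwise stop too early.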

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Here you can find models that were described in the paper "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs" by Chen et al. We trained and tested Resnet-18-8s and Resnet-34-8s against the PASCAL VOC 2012 dataset.

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with this performance:

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-18-8s (ours) | RV-VOC12 | 59.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s (ours) | RV-VOC12 | 66.0 | in prog. | in prog. | 50 ms. | Dropbox |
| Resnet-101-8s (ours) | RV-VOC12 | in prog. | in prog. | in prog. | in prog. | in prog. |
| Resnet-18-8s (ours) | RV-VOC11 | n/a | n/a | n/a | n/a | |
| Resnet-34-8s (ours) | RV-VOC11 | n/a | n/a | n/a | n/a | |
| Resnet-101-8s (ours) | RV-VOC11 | 69.0 | n/a | n/a | 180 ms. | |

Applications

We demonstrate applications of our library for certain tasks that are being ported or have already been ported to mobile devices:

  1. Sticker creation

  2. iPhone's portrait effect

  3. Background replacement

  4. Surgical Robotic Tools Segmentation (see below)

About

If you used the code for your research, please cite the paper:

@article{pakhomov2017deep,
  title={Deep Residual Learning for Instrument Segmentation in Robotic Surgery},
  author={Pakhomov, Daniil and Premachandran, Vittal and Allan, Max and Azizian, Mahdi and Navab, Nassir},
  journal={arXiv preprint arXiv:1703.08580},
  year={2017}
}

During implementation, some preliminary experiments and notes were reported:
