OSVOS: One-Shot Video Object Segmentation

Check our project page for additional information.

OSVOS is a method that tackles the task of semi-supervised video object segmentation. It is based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Experiments on DAVIS 2016 show that OSVOS is faster than currently available techniques and improves the state of the art by a significant margin (79.8% vs 68.0%).

This PyTorch code is an a posteriori implementation of OSVOS and does not contain the boundary snapping branch. The results published in the paper were obtained using the Caffe version, which can be found at OSVOS-caffe. A TensorFlow implementation is also available at OSVOS-TensorFlow.

Installation:

  1. Clone the OSVOS-PyTorch repository

    git clone https://github.com/kmaninis/OSVOS-PyTorch.git
  2. Install - if necessary - the required dependencies:

    • Python (tested with Anaconda 2.7 and 3.6)
    • PyTorch (conda install pytorch torchvision -c pytorch); tested with PyTorch 0.3 and CUDA 8.0
    • Other python dependencies: numpy, scipy, matplotlib, opencv-python
    • Optionally, install tensorboard (pip install tensorboard tensorboardx)
  3. Edit the paths in mypath.py
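
Before moving on, it can help to confirm that the dependencies listed in step 2 are importable. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the REQUIRED_MODULES list are assumptions made for illustration, and the script relies on importlib.util.find_spec, so it needs Python 3.

```python
import importlib.util

# Import names for the dependencies listed in step 2. The opencv-python
# package is imported as `cv2`. This list is an assumption for illustration.
REQUIRED_MODULES = ["torch", "torchvision", "numpy", "scipy", "matplotlib", "cv2"]


def missing_modules(module_names):
    """Return the names in `module_names` that cannot be located for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All required dependencies were found.")
```

The check only verifies that each module can be located; it does not verify versions or that CUDA is usable.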

Online training and testing

  1. Download the parent model (55 MB), and unzip it under models/.
  2. Edit the 'User defined parameters' (e.g. gpu_id) in train_online.py.
  3. Run python train_online.py.

Training the parent network (optional)

  1. Training the parent model requires all the training sequences of DAVIS 2016; download them from here.
  2. Download the VGG model (55 MB) pretrained on ImageNet, and unzip it under models/.
  3. Edit the 'User defined parameters' (e.g. gpu_id) in train_parent.py.
  4. Run python train_parent.py. Training takes 20 hours on a Titan-X Pascal.

Enjoy!

Citation:

@Inproceedings{Cae+17,
  Title          = {One-Shot Video Object Segmentation},
  Author         = {S. Caelles and K.K. Maninis and J. Pont-Tuset and L. Leal-Taix\'e and D. Cremers and L. {Van Gool}},
  Booktitle      = {Computer Vision and Pattern Recognition (CVPR)},
  Year           = {2017}
}

If you encounter any problems with the code or want to report bugs, please contact us at {kmaninis, scaelles}[at]vision[dot]ee[dot]ethz[dot]ch.
