
multiseg's Introduction

Work in Progress, Not Production Ready

Multiple Video Instance Object Segmentation and Tracking

Overview

The goal of this model is to segment multiple object instances from both video and still images and track (identify) objects over consecutive frames. There are 3 major modules:

  • Image Segmentation Module
    • Mask R-CNN
  • Mask Refine Module
    • PWC-Net
    • U-Net
  • Identification Module
    • Triplet Network
    • FAISS Database
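For readers unfamiliar with triplet networks, the sketch below shows the generic triplet margin loss that such networks are commonly trained with. It is an illustration only: this README does not specify the loss, distance metric, or margin used by the Identification Module, so `euclidean`, `triplet_loss`, and the `margin` value are assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet margin loss (margin value is an assumption).

    The loss is zero once the anchor is closer to the positive (same object)
    than to the negative (different object) by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Embeddings trained this way place the same object close together across frames, so identification can be done by a nearest-neighbor lookup, which is the kind of query FAISS is built for.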

MultiSeg Network Diagram

Instructions for Use

This project was developed with Python 3.6, TensorFlow 1.11, Keras 2.1, and NumPy 1.13. For full compatibility, use these versions; other versions may also work.
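A quick way to compare an environment against the versions listed above is a small script like the following. This is a minimal sketch, not part of the repository: the helper names (`major_minor`, `find_mismatches`, `installed_versions`) are hypothetical, and it assumes each package exposes a `__version__` attribute.

```python
from importlib import import_module

# Versions stated in this README; only major.minor is compared.
EXPECTED = {"tensorflow": "1.11", "keras": "2.1", "numpy": "1.13"}


def major_minor(version):
    """Return the 'major.minor' prefix of a version string such as '1.11.0'."""
    return ".".join(version.split(".")[:2])


def find_mismatches(installed, expected=EXPECTED):
    """Map each package whose major.minor differs to (found, wanted).

    `installed` maps package name -> version string, or None if missing.
    """
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None or major_minor(found) != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names=EXPECTED):
    """Import each package and read its __version__ (None if unavailable)."""
    versions = {}
    for name in names:
        try:
            versions[name] = getattr(import_module(name), "__version__", None)
        except Exception:  # a broken install can raise more than ImportError
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, (found, wanted) in find_mismatches(installed_versions()).items():
        print(f"{name}: found {found}, README lists {wanted}.x")
```

The script prints nothing when all three packages match the listed major.minor versions.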

1. Install dependencies

If you are on the shared Google instance (for UMD FIRE COML), there is already a conda virtual env with the correct dependencies. To start it, run

source activate multiseg

Otherwise, run the setup.py script, which will also install the dependencies for this project (using pip):

python setup.py install

If the script fails, you may have to manually install each dependency using pip (pip is required for some dependencies; conda does not work).

2. Acquire datasets

Download datasets

You can acquire the full DAVIS 2017 and full CVPR WAD 2018 datasets from their respective sources. Be warned that the WAD dataset is extremely large, which is why we also provide "mini-DAVIS" and "mini-WAD" datasets. These have the same folder structure as the full datasets but contain only a very small subset of the images, which makes testing and evaluation easier.

  • For CVPR WAD 2018, there is only the 'train' subset (only certain images within the subset, with no annotations).

  • For DAVIS 2017, there is only the 'trainval' subset at 480p resolution (only some of the images/videos within the subset).

The mini-datasets can be found on our GitHub repository's releases page. We are currently on release version 0.1.

Expected Data Directory Structures

As a quick check, make sure the following paths exist (starting from the root data directory):

CVPR WAD 2018: .\train_color\

DAVIS 2017: .\DAVIS-2017-trainval-480p\DAVIS\JPEGImages\480p\blackswan\
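The quick check above can also be scripted. The sketch below is one possible way to do it; `EXPECTED_SUBDIR` and `has_expected_layout` are hypothetical names, and the paths are written with forward slashes so the check runs on any operating system.

```python
from pathlib import Path

# Relative paths listed in this README, one per dataset.
EXPECTED_SUBDIR = {
    "CVPR WAD 2018": "train_color",
    "DAVIS 2017": "DAVIS-2017-trainval-480p/DAVIS/JPEGImages/480p/blackswan",
}


def has_expected_layout(data_root, dataset):
    """Return True if the dataset's expected directory exists under data_root."""
    return (Path(data_root) / EXPECTED_SUBDIR[dataset]).is_dir()
```

For example, `has_expected_layout("data", "DAVIS 2017")` returns `False` if the archive was extracted one level too deep or too shallow (the `"data"` root here is only a placeholder).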

3. Download pre-trained weights

Currently, we have released pre-trained weights for the following modules only:

  • Image Segmentation
    • Default Directory: .\image_seg
  • Optical Flow
    • Default Directory: .\opt_flow\models\pwcnet-lg-6-2-multisteps-chairsthingsmix
    • Note: these weights are saved as a TensorFlow checkpoint, so you will need to place all 3 files in this directory
  • Mask Refine (coarse only)
    • Default Directory: .\mask_refine
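Regarding the note on the optical flow weights: a TensorFlow 1.x checkpoint is typically stored as an `.index` file, a `.meta` file, and one or more `.data-…` shards. The sketch below checks that a directory contains all three kinds. It assumes this usual naming scheme and does not assume any particular checkpoint file name; `checkpoint_parts` and `is_complete` are hypothetical helpers.

```python
from pathlib import Path


def checkpoint_parts(ckpt_dir):
    """Group the files in ckpt_dir into the three usual TF1 checkpoint parts."""
    files = [p.name for p in Path(ckpt_dir).iterdir() if p.is_file()]
    return {
        "index": [f for f in files if f.endswith(".index")],
        "meta": [f for f in files if f.endswith(".meta")],
        "data": [f for f in files if ".data-" in f],
    }


def is_complete(ckpt_dir):
    """Return True if the directory exists and holds all three kinds of file."""
    if not Path(ckpt_dir).is_dir():
        return False
    return all(checkpoint_parts(ckpt_dir).values())
```

If `is_complete` returns `False`, `checkpoint_parts` shows which of the three kinds of file is missing.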

As with our mini-datasets, the weight binaries for the image segmentation and mask refine modules can be found on our GitHub repository's releases page.

For the optical flow model weights, you can find the latest versions here (make sure to download the one that matches the directory name specified above).

4. Run a demo inference script

It's very easy to run these notebooks:

  1. In the first few cells, check that you've downloaded the file dependencies and have them in the right location (or change the path in the code).
  2. Then, simply execute each cell in order.
  3. To rerun the inference for different images, simply rerun the cells in the inference section of each notebook, and a new random image will be chosen.
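The random image selection mentioned in step 3 could look roughly like the sketch below. The notebooks' actual selection code is not shown in this README, so this is an assumption: `pick_random_image` is a hypothetical helper, and the `*.jpg` pattern is a guess at the image file extension.

```python
import random
from pathlib import Path


def pick_random_image(image_root, pattern="*.jpg", rng=random):
    """Return one randomly chosen image path under image_root (searched recursively)."""
    candidates = sorted(Path(image_root).rglob(pattern))
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern!r} under {image_root}")
    return rng.choice(candidates)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which helps when comparing outputs between runs.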

Instance Segmentation (Matterport implementation of Mask R-CNN)

Notebook: Instance Segmentation Notebook

Dataset: CVPR WAD 2018

Mask Refine

Notebook: Mask Refine Demo Notebook

Dataset: DAVIS 2017

Instance Identification

See separate repository (linked under the instance_id directory). In the future, when this module is more mature, it will be integrated into the current repository.

Future Work

  • integrate ImageSeg and MaskRefine
  • develop triplet network in Keras
  • refine MaskRefine using new loss function
  • integrate all 3 modules
  • evaluation & metrics

Samples

Coarse Mask Refine Module Outputs

[Figure: Coarse Mask Refine Inputs and Outputs]

multiseg's People

Contributors

tmthyln, kjhurley99, bencarlisle15, raytu, joshrclo
