LOcal COntext based Faster R-CNN

A local context layer is implemented on top of Faster R-CNN (see: py-faster-rcnn code) to detect small objects more effectively.

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Basic installation
  4. Data preparation
  5. Training and Testing
  6. Usage

Requirements: software

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download Makefile.config for reference.

  2. Python packages you might not have: cython, python-opencv, easydict
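
To confirm that the Python packages listed above are installed before building, a small check like the following may help. It is a minimal sketch, not part of the LOCO repository: the helper name `check_modules` is hypothetical, and it relies on the fact that cython is imported as `Cython` and python-opencv as `cv2`. `caffe` is included so the same check can be rerun once pycaffe has been built.

```python
import importlib

# Import names for the packages listed above (not their pip/apt names):
# cython -> Cython, python-opencv -> cv2, easydict -> easydict.
# "caffe" is pycaffe itself and only imports after it has been built.
REQUIRED_MODULES = ("Cython", "cv2", "easydict", "caffe")


def check_modules(names=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status
```

Calling `check_modules()` with no arguments checks all four modules; any entry mapped to `False` still needs to be installed or put on the Python path.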

Requirements: hardware

  1. For training smaller networks (ZF, VGG_CNN_M_1024), a good GPU (e.g., Titan, K20, K40, ...) with at least 3 GB of memory suffices.
  2. For training Fast R-CNN with VGG16, you'll need a K40 (~11 GB of memory).
  3. For training the end-to-end version of Faster R-CNN or LOCO with VGG16, 3 GB of GPU memory is sufficient (using CUDNN).
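
To see which installed GPUs meet these memory figures, the output of `nvidia-smi` can be checked with a short script. This is a sketch under assumptions: the helper names are hypothetical, "3G" is read as 3 GiB, and the check looks at total memory rather than memory that is currently free.

```python
import subprocess


def parse_memory_mib(smi_output):
    """Parse nvidia-smi CSV output: one integer (MiB) per non-empty line."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def gpus_with_enough_memory(required_gib, smi_output):
    """Return the indices of GPUs whose total memory is at least required_gib."""
    required_mib = required_gib * 1024
    return [index for index, mib in enumerate(parse_memory_mib(smi_output))
            if mib >= required_mib]


def query_total_memory():
    """Ask nvidia-smi for the total memory of every GPU, in MiB."""
    return subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"]).decode()
```

On a machine with the NVIDIA driver installed, `gpus_with_enough_memory(3, query_total_memory())` lists the GPU IDs that should be large enough for end-to-end training with VGG16.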

Installation (sufficient for the demo)

  1. Clone the LOCO repository

    git clone https://github.com/CPFLAME/LOCO.git

    We'll call the directory that you cloned LOCO into $LOCO.

    You'll need to manually clone the caffe-fast-rcnn submodule:

    cd $LOCO
    git clone https://github.com/rbgirshick/caffe-fast-rcnn.git
  2. Build the Cython modules

    cd $LOCO/lib
    make
  3. Build Caffe and pycaffe

    cd $LOCO/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j8 && make pycaffe
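
Once the build finishes, the freshly built pycaffe and the Cython modules have to be on the Python path before they can be imported. The sketch below assumes the standard Caffe layout (pycaffe under `caffe-fast-rcnn/python`) and the `lib` directory built in step 2; the helper names are hypothetical, and the repository's own tools may already set these paths for you.

```python
import os
import sys


def pycaffe_paths(loco_root):
    """Return the directories assumed to hold pycaffe and the LOCO lib code."""
    return [
        os.path.join(loco_root, "caffe-fast-rcnn", "python"),
        os.path.join(loco_root, "lib"),
    ]


def add_paths(loco_root):
    """Prepend the directories to sys.path, skipping any already present."""
    for path in pycaffe_paths(loco_root):
        if path not in sys.path:
            sys.path.insert(0, path)
```

After `add_paths("/path/to/LOCO")`, a plain `import caffe` in the same interpreter is a quick way to confirm that `make pycaffe` succeeded.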

Data preparation

  1. Download the traffic dataset

    The dataset is adapted from Tsinghua-Tencent 100K, a traffic sign dataset. We treat all traffic signs as a single category and convert the annotations into the PASCAL VOC format. For convenience, we call it VOCdevkit2007.

    Download these four files: http://pan.baidu.com/s/1o8n2bVG

  2. Extract all of these tars into one directory named VOCdevkit2007

    cat VOCdevkit2007.tar0* | tar -xzv
  3. It should have this basic structure

    $VOCdevkit2007/                           # development kit
    $VOCdevkit2007/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $LOCO/data
    ln -s $VOCdevkit2007 VOCdevkit2007

    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
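
Before training, it can be worth checking that the extracted directory (or the symlink to it) has the expected layout. The sketch below assumes the traffic dataset follows the standard PASCAL VOC 2007 subdirectories (`Annotations`, `ImageSets/Main`, `JPEGImages`); this README does not list them, so treat the names as an assumption. The helper `missing_voc_dirs` is hypothetical.

```python
import os

# Subdirectories of VOC2007 in the standard PASCAL VOC devkit layout
# (assumed here; adjust if the traffic dataset is organized differently).
EXPECTED_SUBDIRS = (
    "Annotations",
    os.path.join("ImageSets", "Main"),
    "JPEGImages",
)


def missing_voc_dirs(devkit_root, year="2007"):
    """Return the expected directories that do not exist under devkit_root."""
    voc_root = os.path.join(devkit_root, "VOC" + year)
    return [os.path.join(voc_root, sub) for sub in EXPECTED_SUBDIRS
            if not os.path.isdir(os.path.join(voc_root, sub))]
```

An empty list from `missing_voc_dirs("data/VOCdevkit2007")`, run from `$LOCO`, suggests the symlink and the extraction both worked.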

Training and Testing

  1. For training

    First, download the pre-trained ImageNet models:

    cd $LOCO
    ./data/scripts/fetch_imagenet_models.sh

    VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

    Then start training:

    cd $LOCO
    ./experiments/scripts/faster_rcnn_context.sh [GPU_ID] VGG16 pascal_voc
  2. For testing

    We have released our pretrained model, which you can download for testing.

    Start testing:

    cd $LOCO
    ./tools/test_net.py --gpu [GPU_ID] --def models/pascal_voc/VGG16/faster_rcnn_end2end/context_test.prototxt --net $your_model_path --imdb voc_2007_test --cfg experiments/cfgs/faster_rcnn_end2end.yml
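
If you run training and testing repeatedly, for example on different GPUs or with different snapshots, it may be convenient to assemble the two commands above in a script. The following is a minimal sketch: the function names are hypothetical, while the script paths and arguments are copied from the commands above.

```python
import subprocess


def build_train_command(gpu_id, net="VGG16", dataset="pascal_voc"):
    """Return the training command from step 1 as an argument list."""
    return ["./experiments/scripts/faster_rcnn_context.sh",
            str(gpu_id), net, dataset]


def build_test_command(gpu_id, model_path):
    """Return the testing command from step 2 as an argument list."""
    return [
        "./tools/test_net.py",
        "--gpu", str(gpu_id),
        "--def", "models/pascal_voc/VGG16/faster_rcnn_end2end/context_test.prototxt",
        "--net", model_path,
        "--imdb", "voc_2007_test",
        "--cfg", "experiments/cfgs/faster_rcnn_end2end.yml",
    ]


def run_in_loco(command, loco_root):
    """Run a command from the $LOCO directory and return its exit code."""
    return subprocess.call(command, cwd=loco_root)
```

For example, `run_in_loco(build_test_command(0, "/path/to/model.caffemodel"), "/path/to/LOCO")` runs the test on GPU 0; both paths are placeholders.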

Usage

Trained networks are saved under:

output/<experiment directory>/<dataset name>/

Test outputs are saved under:

output/<experiment directory>/<dataset name>/<network snapshot name>/
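
To pick up the most recent trained network from the first directory above, for instance to pass it as `--net` when testing, a helper along these lines could be used. It is a sketch that assumes snapshots are saved with Caffe's usual `.caffemodel` extension; the function name is hypothetical.

```python
import glob
import os


def latest_snapshot(output_dir):
    """Return the most recently modified .caffemodel in output_dir, or None."""
    candidates = glob.glob(os.path.join(output_dir, "*.caffemodel"))
    if not candidates:
        return None
    # Choose by modification time rather than by parsing file names.
    return max(candidates, key=os.path.getmtime)
```

Selecting by modification time avoids relying on any particular snapshot naming scheme.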
