Accurate Single Stage Detector Using Recurrent Rolling Convolution

By Jimmy Ren, Xiaohao Chen, Jianbo Liu, Wenxiu Sun, Jiahao Pang, Qiong Yan, Yu-Wing Tai, Li Xu.

Introduction

High localization accuracy is crucial in many real-world applications. We propose a novel single-stage, end-to-end object detection network (RRC) that produces highly accurate detection results. You can use this code to train and evaluate a network for object detection. For more details, please refer to our paper (https://arxiv.org/abs/1704.05776).

method               KITTI test mAP, car (moderate)
Mono3D               88.66%
SDP+RPN              88.85%
MS-CNN               89.02%
Sub-CNN              89.04%
RRC (single model)   89.85%

[Figure: KITTI ranking]

Citing RRC

Please cite RRC in your publications if it helps your research:

@inproceedings{Ren17CVPR,
author = {Jimmy Ren and Xiaohao Chen and Jianbo Liu and Wenxiu Sun and Jiahao Pang and Qiong Yan and Yu-Wing Tai and Li Xu},
title = {Accurate Single Stage Detector Using Recurrent Rolling Convolution},
booktitle = {CVPR},
year = {2017}
}

Contents

  1. Installation
  2. Preparation
  3. Train/Eval
  4. Models
  5. Acknowledgement

Installation

  1. Get the code. We will refer to the directory that you cloned Caffe into as $CAFFE_ROOT.
    git clone https://github.com/xiaohaoChen/rrc_detection.git
    cd rrc_detection
  2. Build the code. Please follow the Caffe instructions to install all necessary packages and build it. Before building, you should install CUDA and cuDNN (v5.0).
    We used CUDA 7.5 and cuDNN v5.0 on our machine.
    # Modify Makefile.config according to your Caffe installation.
    cp Makefile.config.example Makefile.config
    make -j8
    # Make sure to include $CAFFE_ROOT/python in your PYTHONPATH.
    make py
    make test -j8
    make runtest -j8
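
The PYTHONPATH note above can be checked before trying to import `caffe`. The following is a minimal sketch and not part of the repository: the helper name `caffe_python_on_path` is hypothetical, and it only assumes that the Python bindings live in `$CAFFE_ROOT/python`, as stated above.

```python
import os


def caffe_python_on_path(caffe_root, pythonpath=None):
    """Return True if <caffe_root>/python is listed in PYTHONPATH.

    Hypothetical helper for illustration only. `pythonpath` defaults to
    the PYTHONPATH environment variable of the current process.
    """
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    wanted = os.path.normpath(os.path.join(caffe_root, "python"))
    # Normalise each entry so trailing slashes do not cause a mismatch.
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return wanted in entries


if __name__ == "__main__":
    root = os.environ.get("CAFFE_ROOT", ".")
    print("PYTHONPATH includes Caffe bindings:", caffe_python_on_path(root))
```

If the check fails, adding `$CAFFE_ROOT/python` to PYTHONPATH in your shell profile is the usual fix.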

Preparation

  1. Download the fully convolutional reduced (atrous) VGGNet. By default, we assume the model is stored in $CAFFE_ROOT/models/VGGNet/.

  2. Download the KITTI dataset (http://www.cvlibs.net/datasets/kitti/eval_object.php). By default, we assume the data is stored in $HOME/data/KITTI/.
    Unzip the training images, testing images, and labels into $HOME/data/KITTI/.

  3. Create the LMDB files for training. Because only the images that contain cars are used as the training set for car detection, the labels for cars must be extracted first.
    We provide the list of images that contain cars in $CAFFE_ROOT/data/KITTI-car/.

    # extract the labels for cars
    cd $CAFFE_ROOT/data/KITTI-car/
    ./extract_car_label.sh

    Before creating the LMDB files, the labels must be converted to the VOC format. We provide some MATLAB scripts for this.
    The scripts are in $CAFFE_ROOT/data/convert_labels/. You only need to modify converlabels.m:

    line 4: root_dir = '/your/path/to/KITTI/';

    The VOC-format labels will be generated in $KITTI_ROOT/training/labels_2car/xml/.

    cd $CAFFE_ROOT/data/KITTI-car/
    # Create the trainval.txt, test.txt, and test_name_size.txt in data/KITTI-car/
    ./create_list.sh
    # You can modify the parameters in create_data.sh if needed.
    # It will create LMDB files for trainval and test with encoded original images:
    #   - $HOME/data/KITTI/lmdb/KITTI-car_training_lmdb/
    #   - $HOME/data/KITTI/lmdb/KITTI-car_testing_lmdb/
    # and make soft links at data/KITTI-car/lmdb
    cd $CAFFE_ROOT
    ./data/KITTI-car/create_data.sh
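
The label preparation described above (keep only the car annotations, then convert them to VOC-style XML) can be sketched in Python. This is an illustration only, not a port of `extract_car_label.sh` or of the MATLAB scripts. It assumes the standard KITTI object label layout (object type in the first field, 2D box `left top right bottom` in fields 5 to 8). The function names, the `car` class name, the set of XML fields, and the sample label lines are all hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_car_boxes(label_text):
    """Return (xmin, ymin, xmax, ymax) tuples for every 'Car' line."""
    boxes = []
    for line in label_text.splitlines():
        fields = line.split()
        # Skip blank or short lines and every object type other than 'Car'.
        if len(fields) < 8 or fields[0] != "Car":
            continue
        left, top, right, bottom = (float(v) for v in fields[4:8])
        boxes.append((left, top, right, bottom))
    return boxes


def boxes_to_voc_xml(filename, width, height, boxes, name="car"):
    """Build a minimal VOC-style annotation and return it as an XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            # Rounding to integer pixels is a simplification made here.
            ET.SubElement(bndbox, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up label lines in KITTI format, for demonstration only.
    sample = (
        "Car 0.00 0 -1.57 100.2 150.7 300.4 250.1 1.5 1.6 3.9 1.0 1.7 20.0 -1.5\n"
        "Pedestrian 0.00 0 0.2 400.0 160.0 430.0 240.0 1.8 0.6 0.9 5.0 1.6 15.0 0.1\n"
    )
    print(boxes_to_voc_xml("000000.png", 1242, 375, parse_car_boxes(sample)))
```

The provided scripts remain the reference for the actual conversion. The sketch only shows the shape of the data before and after it.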

Train/Eval

  1. Train and evaluate your model.
    # It will create model definition files and save snapshot models in:
    #   - $CAFFE_ROOT/models/VGGNet/KITTI/RRC_2560x768_kitti_car/
    # and job file, log file in:
    #   - $CAFFE_ROOT/jobs/VGGNet/KITTI/RRC_2560x768_kitti_car/
    # After 60k iterations, you should obtain the model reported in the paper (mAP 89.*% on KITTI).
    python examples/car/rrc_kitti_car.py
    # Before running the testing script, modify [line 10: img_dir] to [your path to kitti testing images].
    python examples/car/rrc_test.py
    We trained our models on a machine with 4 TITAN X (Maxwell) GPUs. By default, we assume you train the models on a machine with 4 TITAN X GPUs.
    If you only have one TITAN X card, modify the script rrc_kitti.py as follows:
    line 118: gpus = "0,1,2,3" -> gpus = "0"
    line 123: batch_size = 4   -> batch_size = 1
    If you have two TITAN X cards, modify the script rrc_kitti.py as follows:
    line 118: gpus = "0,1,2,3" -> gpus = "0,1"
    line 123: batch_size = 4   -> batch_size = 2
    You can submit the results to the KITTI evaluation server. If you don't have time to train your model, you can download a pre-trained model from one of the following links:
    Google Drive
    Baidu Cloud
    Unzip the files into $CAFFE_ROOT/models/VGGNet/KITTI/ and run the testing script rrc_test.py. You should get the same result as the single-model result reported in the paper.
    # Before running the script, modify kitti_root at line 10.
    # Make sure that the working directory is $CAFFE_ROOT.
    cd $CAFFE_ROOT
    python models/VGGNet/KITTI/RRC_2560x768_kitti_4r4b_max_size/rrc_test.py
  2. Evaluate the most recent snapshot. To test a model you trained, you should modify the path in rrc_test.py.
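
Two of the steps above lend themselves to a short sketch: scaling the GPU settings to the number of cards, and locating the most recent snapshot to test. The helpers below are hypothetical and not part of the repository. `gpu_settings` simply follows the pattern of the one-card, two-card, and four-card examples above (one image per GPU). `latest_snapshot` assumes Caffe's usual `<prefix>_iter_<N>.caffemodel` snapshot naming.

```python
import os
import re


def gpu_settings(num_gpus):
    """Return (gpus, batch_size) following the one-image-per-GPU pattern above."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    gpus = ",".join(str(i) for i in range(num_gpus))
    return gpus, num_gpus


def latest_snapshot(snapshot_dir):
    """Return the .caffemodel path with the highest iteration, or None."""
    pattern = re.compile(r"_iter_(\d+)\.caffemodel$")
    best_iter, best_path = -1, None
    for name in os.listdir(snapshot_dir):
        match = pattern.search(name)
        # Compare iteration counts numerically, not as strings.
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best_path = os.path.join(snapshot_dir, name)
    return best_path


if __name__ == "__main__":
    print(gpu_settings(2))
```

The path returned by `latest_snapshot` is what you would then write into rrc_test.py by hand. The sketch does not edit the script for you.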

Acknowledgement

Thanks to Wei Liu. We have benefited a lot from his previous work, SSD (Single Shot MultiBox Detector), and his code.
