DNLNet for Object Detection

By Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu.

This repo is an official implementation of "Disentangled Non-Local Neural Networks" for COCO object detection, based on open-mmlab's mmdetection. Many thanks to mmdetection for their simple and clean framework.

Introduction

DNLNet was initially described in an arXiv paper. This work is still in progress.

Citing DNLNet

@misc{yin2020disentangled,
    title={Disentangled Non-Local Neural Networks},
    author={Minghao Yin and Zhuliang Yao and Yue Cao and Xiu Li and Zheng Zhang and Stephen Lin and Han Hu},
    year={2020},
    booktitle={ECCV}
}

Main Results

Results on R50-FPN with SyncBN backbone

Backbone  Model  Backbone Norm  Heads     Context         Lr schd  box AP  mask AP  Download
R50-FPN   Mask   SyncBN         4Conv1FC  -               1x       38.8    35.1     model | log
R50-FPN   Mask   SyncBN         4Conv1FC  NL(c4)          1x       39.6    35.8     model | log
R50-FPN   Mask   SyncBN         4Conv1FC  GC(c4, r4)      1x       40.1    36.2     model | log
R50-FPN   Mask   SyncBN         4Conv1FC  SNL(c4)         1x       40.1    36.2     model | log
R50-FPN   Mask   SyncBN         4Conv1FC  DNL(c4)         1x       40.3    36.4     model | log
R50-FPN   Mask   SyncBN         4Conv1FC  DNL(c4+c5_all)  1x       41.2    37.2     model | log

Results on stronger backbones

  • Ongoing

Notes

  • NL denotes that a Non-local block is inserted after the 1x1 conv of the backbone.
  • GC denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
  • SNL denotes that a Simplified Non-local block is inserted after the 1x1 conv of the backbone.
  • DNL denotes that a Disentangled Non-local block is inserted after the 1x1 conv of the backbone.
  • r4 denotes a ratio of 4 in the GC block.
  • c4 and c5_all denote inserting the context block at the last residual block of stage c4 and at all blocks of stage c5, respectively.
  • Most models are trained on 16 GPUs with 4 images per GPU.
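
The last note implies an effective batch size of 16 × 4 = 64 images per iteration. The sketch below works through that arithmetic and, as an assumption, applies the linear scaling rule documented by upstream mmdetection, whose reference configs use a learning rate of 0.02 for 16 images per iteration (8 GPUs × 2 images). Whether the configs in this repository follow the same rule is not stated here, and the function names are illustrative only.

```python
def effective_batch_size(num_gpus: int, imgs_per_gpu: int) -> int:
    """Total number of images processed per training iteration."""
    return num_gpus * imgs_per_gpu


def scaled_lr(batch_size: int, base_lr: float = 0.02, base_batch_size: int = 16) -> float:
    """Linear scaling rule: the learning rate grows in proportion to batch size.

    The defaults (0.02 for a batch of 16) are upstream mmdetection's
    convention and are an assumption for this repository.
    """
    return base_lr * batch_size / base_batch_size


if __name__ == "__main__":
    batch = effective_batch_size(num_gpus=16, imgs_per_gpu=4)
    print("effective batch size:", batch)
    print("scaled learning rate:", scaled_lr(batch))
```

When training on fewer GPUs, the same helper gives a starting point for adjusting the learning rate. Check the result against the config file you actually train with.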

Requirements

  • Linux (tested on Ubuntu 16.04)
  • Python 3.6+
  • Cython
  • PyTorch 1.1.0
  • CUDA 9.0
  • cuDNN 7.0
  • NCCL 2.3.5
  • apex
  • inplace_abn
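
Before installing, it can help to check which of the Python-level requirements are already present. The following is a minimal sketch using only the standard library. The import names `torch`, `apex`, and `Cython` are assumptions about how the listed packages are exposed, and the script checks neither package versions nor CUDA, cuDNN, or NCCL.

```python
import importlib.util
import sys

# Assumed import names for some of the packages listed above.
REQUIRED_MODULES = ("torch", "apex", "Cython")


def python_ok(minimum=(3, 6)):
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("Missing modules:", missing_modules() or "none")
```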

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install the latest apex with CUDA and C++ extensions following these instructions. The inplace_abn implemented by apex is required.

c. Clone the DNLNet repository.

git clone https://github.com/Howal/DNL-Object-Detection.git

d. Install the DNLNet version of mmdetection (other dependencies will be installed automatically).

cd DNL-Object-Detection
python setup.py build develop  # or python3; add --user if you want to install it locally
# or: pip install -v -e .

Please refer to the mmdetection installation instructions for more details.

Usage

Train

As in the original mmdetection, distributed training is recommended on both a single machine and multiple machines.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

  • --validate: perform evaluation every k (default=1) epochs during training.
  • --work_dir <WORK_DIR>: if specified, replaces the working directory path set in the config file.
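
To launch training from a script, the command above can be assembled programmatically. This is only a sketch: `build_train_command` is a hypothetical helper, not part of the repository, and it covers just the two arguments listed here. The `<CONFIG_FILE>` and `<WORK_DIR>` placeholders must be replaced with real paths.

```python
import shlex


def build_train_command(config_file, gpu_num, validate=False, work_dir=None):
    """Build the argument list for ./tools/dist_train.sh (hypothetical helper)."""
    cmd = ["./tools/dist_train.sh", str(config_file), str(gpu_num)]
    if validate:
        cmd.append("--validate")
    if work_dir is not None:
        cmd += ["--work_dir", str(work_dir)]
    return cmd


if __name__ == "__main__":
    # Placeholders only; substitute a real config file and working directory.
    cmd = build_train_command("<CONFIG_FILE>", 8, validate=True, work_dir="<WORK_DIR>")
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.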

Evaluation

To evaluate a trained model, an output file is required.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

  • --gpus: number of GPUs used for evaluation.
  • --out: output file name, usually ending with .pkl.
  • --eval: type of evaluation needed; for Mask R-CNN, bbox segm evaluates both bounding box and mask AP.
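
The file written by `--out` can be inspected afterwards. The sketch below assumes, as in upstream mmdetection, that the file is a pickled list with one entry per test image. That layout is not confirmed for this repository, and `load_results` and `summarize` are illustrative names. Loading a real output file will likely also require the packages that produced it (for example NumPy), and pickle files should only be loaded from trusted sources.

```python
import pickle
import sys
from pathlib import Path


def load_results(path):
    """Load the object stored in the output file given to --out."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def summarize(results):
    """Report how many entries the list holds and the type of the first one."""
    entry_type = type(results[0]).__name__ if results else None
    return {"num_images": len(results), "entry_type": entry_type}


if __name__ == "__main__":
    # Usage: python inspect_results.py <OUTPUT_FILE>
    if len(sys.argv) == 2:
        print(summarize(load_results(sys.argv[1])))
```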
