
dpns's Introduction

Dual Path Networks

This repository contains the code and trained models of:

Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng. "Dual Path Networks" (NIPS17).


  • DPNs helped us win 1st place in the Object Localization Task of ILSVRC 2017, and place within the Top 3 in all competition tasks. (Team: NUS-Qihoo_DPNs)

Implementation

DPNs are implemented in MXNet (@92053bd).

Augmentation

| Method        | Settings     |
|---------------|--------------|
| Random Mirror | True         |
| Random Crop   | 8% - 100%    |
| Aspect Ratio  | 3/4 - 4/3    |
| Random HSL    | [20, 40, 50] |

Note: We did not use PCA Lighting or any other advanced augmentation methods. Input images are resized by bicubic interpolation.
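
For reference, a minimal Python/PIL sketch of the crop and mirror settings above (Inception-style random area of 8%-100%, aspect ratio 3/4-4/3, bicubic resize) might look like the following. The function name and defaults are illustrative assumptions; the random HSL jitter is omitted, and this is not the repository's actual data pipeline.

```python
import math
import random
from PIL import Image

def random_crop_and_mirror(img, out_size=224, area_range=(0.08, 1.0),
                           ratio_range=(3.0 / 4.0, 4.0 / 3.0)):
    """Illustrative Inception-style augmentation: sample 8%-100% of the image
    area with an aspect ratio in [3/4, 4/3], resize to the training crop size
    with bicubic interpolation, then mirror with probability 0.5."""
    w, h = img.size
    crop = img  # fall back to the full image if no valid crop is found
    for _ in range(10):
        area = w * h * random.uniform(*area_range)
        ratio = random.uniform(*ratio_range)
        cw = int(round(math.sqrt(area * ratio)))
        ch = int(round(math.sqrt(area / ratio)))
        if cw <= w and ch <= h:
            x = random.randint(0, w - cw)
            y = random.randint(0, h - ch)
            crop = img.crop((x, y, x + cw, y + ch))
            break
    crop = crop.resize((out_size, out_size), Image.BICUBIC)
    if random.random() < 0.5:  # random horizontal mirror
        crop = crop.transpose(Image.FLIP_LEFT_RIGHT)
    return crop
```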

Normalization

The augmented input images are first subtracted by the mean RGB = [124, 117, 104], and then multiplied by 0.0167.
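
As a concrete illustration, this normalization amounts to the following numpy sketch (the helper name and the H x W x 3, RGB, float layout are assumptions):

```python
import numpy as np

MEAN_RGB = np.array([124.0, 117.0, 104.0])  # mean RGB given above
SCALE = 0.0167                              # scale factor given above

def normalize(img):
    """img: H x W x 3 float array in RGB order, values in [0, 255]."""
    return (img - MEAN_RGB) * SCALE
```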

Mean-Max Pooling

Here, we introduce a new testing technique, Mean-Max Pooling, which can further improve the performance of a well-trained CNN at test time without any additional training or fine-tuning. This technique is designed for the case where the testing images are larger than the training crops. The idea is to first convert a trained CNN model into a fully convolutional network and then insert the following Mean-Max Pooling layer (a.k.a. Max-Avg Pooling), i.e. 0.5 * (global average pooling + global max pooling), just before the final softmax layer.

Based on our observations, Mean-Max Pooling consistently boosts testing accuracy. We adopted this testing strategy in both LSVRC16 and LSVRC17.
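
At the feature-map level, the inserted layer computes the following (a minimal numpy sketch; in practice it would be an MXNet layer placed between the last convolutional feature maps and the softmax, and the function name here is an assumption):

```python
import numpy as np

def mean_max_pool(features):
    """Mean-Max (Max-Avg) Pooling over the spatial dimensions.

    features: N x C x H x W convolutional feature maps from the
    fully-convolutional form of a trained model. Returns an N x C array
    that is fed to the final classifier/softmax.
    """
    avg = features.mean(axis=(2, 3))  # global average pooling
    mx = features.max(axis=(2, 3))    # global max pooling
    return 0.5 * (avg + mx)
```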

Results

ImageNet-1k

Single Model, Single Crop Validation Error:

   
| Model   | Size   | GFLOPs | 224x224 Top-1 | 224x224 Top-5 | 320x320 Top-1 | 320x320 Top-5 | 320x320 (mean-max) Top-1 | 320x320 (mean-max) Top-5 |
|---------|--------|--------|---------------|---------------|---------------|---------------|--------------------------|--------------------------|
| DPN-68  | 49 MB  | 2.5    | 23.57         | 6.93          | 22.15         | 5.90          | 21.51                    | 5.52                     |
| DPN-92  | 145 MB | 6.5    | 20.73         | 5.37          | 19.34         | 4.66          | 19.04                    | 4.53                     |
| DPN-98  | 236 MB | 11.7   | 20.15         | 5.15          | 18.94         | 4.44          | 18.72                    | 4.40                     |
| DPN-131 | 304 MB | 16.0   | 19.93         | 5.12          | 18.62         | 4.23          | 18.55                    | 4.16                     |

ImageNet-1k (Pretrained on ImageNet-5k)

Single Model, Single Crop Validation Error:

| Model   | Size   | GFLOPs | 224x224 Top-1 | 224x224 Top-5 | 320x320 Top-1 | 320x320 Top-5 | 320x320 (mean-max) Top-1 | 320x320 (mean-max) Top-5 |
|---------|--------|--------|---------------|---------------|---------------|---------------|--------------------------|--------------------------|
| DPN-68  | 49 MB  | 2.5    | 22.45         | 6.09          | 20.92         | 5.26          | 20.62                    | 5.07                     |
| DPN-92  | 145 MB | 6.5    | 19.98         | 5.06          | 19.00         | 4.37          | 18.79                    | 4.19                     |
| DPN-107 | 333 MB | 18.3   | 19.75         | 4.94          | 18.34         | 4.19          | 18.15                    | 4.03                     |

Note: DPN-107 is not well trained.

ImageNet-5k

Single Model, Single Crop Validation Accuracy:

| Model  | Size   | GFLOPs | 224x224 Top-1 | 224x224 Top-5 | 320x320 Top-1 | 320x320 Top-5 | 320x320 (mean-max) Top-1 | 320x320 (mean-max) Top-5 |
|--------|--------|--------|---------------|---------------|---------------|---------------|--------------------------|--------------------------|
| DPN-68 | 61 MB  | 2.5    | 61.27         | 85.46         | 61.54         | 85.99         | 62.35                    | 86.20                    |
| DPN-92 | 184 MB | 6.5    | 67.31         | 89.49         | 66.84         | 89.38         | 67.42                    | 89.76                    |

Note: The larger model size comes from the final classifier (a 5k-way instead of a 1k-way fully connected layer). Models trained on ImageNet-5k learn much richer feature representations than models trained on ImageNet-1k.

Efficiency (Training)

Training speed was measured with MXNet (@92053bd).

Multiple Nodes (Without specific code optimization):

| Model   | CUDA / cuDNN | #Nodes | GPU Cards (per node) | Batch Size (per GPU) | kvstore   | GPU Mem (per GPU) | Training Speed* (per node) |
|---------|--------------|--------|----------------------|----------------------|-----------|-------------------|----------------------------|
| DPN-68  | 8.0 / 5.1    | 10     | 4 x K80 (Tesla)      | 64                   | dist_sync | 9337 MiB          | 284 img/sec                |
| DPN-92  | 8.0 / 5.1    | 10     | 4 x K80 (Tesla)      | 32                   | dist_sync | 8017 MiB          | 133 img/sec                |
| DPN-98  | 8.0 / 5.1    | 10     | 4 x K80 (Tesla)      | 32                   | dist_sync | 11128 MiB         | 85 img/sec                 |
| DPN-131 | 8.0 / 5.1    | 10     | 4 x K80 (Tesla)      | 24                   | dist_sync | 11448 MiB         | 60 img/sec                 |
| DPN-107 | 8.0 / 5.1    | 10     | 4 x K80 (Tesla)      | 24                   | dist_sync | 12086 MiB         | 55 img/sec                 |

*This is the actual training speed, which includes data augmentation, forward, backward, parameter update, network communication, etc. MXNet is awesome; we observed a linear speedup, as shown in the link.

Trained Models

| Model    | Size   | Dataset            | MXNet Model |
|----------|--------|--------------------|-------------|
| DPN-68   | 49 MB  | ImageNet-1k        | GoogleDrive |
| DPN-68*  | 49 MB  | ImageNet-1k        | GoogleDrive |
| DPN-68   | 61 MB  | ImageNet-5k        | GoogleDrive |
| DPN-92   | 145 MB | ImageNet-1k        | GoogleDrive |
| DPN-92   | 138 MB | Places365-Standard | GoogleDrive |
| DPN-92*  | 145 MB | ImageNet-1k        | GoogleDrive |
| DPN-92   | 184 MB | ImageNet-5k        | GoogleDrive |
| DPN-98   | 236 MB | ImageNet-1k        | GoogleDrive |
| DPN-131  | 304 MB | ImageNet-1k        | GoogleDrive |
| DPN-107* | 333 MB | ImageNet-1k        | GoogleDrive |

*Pretrained on ImageNet-5k and then fine-tuned on ImageNet-1k.
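
For readers new to MXNet, a checkpoint from the table above can be loaded roughly as follows. This is only a sketch: the 'dpn92' prefix, epoch 0, and the 224x224 input shape are assumptions and should match the downloaded .json/.params file names and the intended crop size.

```python
import mxnet as mx

# Load the symbol and weights saved by MXNet's checkpointing
# (expects dpn92-symbol.json and dpn92-0000.params in the working directory;
# adjust the prefix/epoch to the files you actually downloaded).
sym, arg_params, aux_params = mx.model.load_checkpoint('dpn92', 0)

# Bind the network for inference on a single 224x224 RGB image.
mod = mx.mod.Module(symbol=sym, context=mx.gpu(0), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params)
```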

Third-party Implementations

Other Resources

ImageNet-1k Training/Validation List:

ImageNet-1k category name mapping table:

ImageNet-5k Raw Images:

  • ImageNet-5k is a subset of the ImageNet10K dataset provided by this paper.
  • Please download ImageNet10K and then extract ImageNet-5k using the list below.

ImageNet-5k Training/Validation List:

  • It contains about 5k leaf categories from ImageNet10K. There is no category overlap between our provided ImageNet-5k and the official ImageNet-1k.
  • Download link: GoogleDrive (https://goo.gl/kNZC4j)
  • Mapping Table: GoogleDrive

Places365-Standard Validation List & Matlab code for 10-crop testing:

Citation

If you use DPN in your research, please cite the paper:

@article{Chen2017,
  title={Dual Path Networks},
  author={Yunpeng Chen and Jianan Li and Huaxin Xiao and Xiaojie Jin and Shuicheng Yan and Jiashi Feng},
  journal={arXiv preprint arXiv:1707.01629},
  year={2017}
}


dpns's Issues

ImageNet-5k data

Hi, Yunpeng. I am trying to prepare the ImageNet-5k training data from your provided train.lst.
I have prepared ImageNet-10k, and I found that many images in your train.lst are not included in the ImageNet-10k dataset.
For example:
IOError: [Errno 2] No such file or directory: '/home/datasets/Dataset/imagenet10k/n02399000/n02399000_5702.JPEG'

Would you mind sharing more information about the preparation of the ImageNet-5k data?

Finetune on new dataset

Hi, I am trying to fine-tune DPN-107 on a new dataset. I use the latest MXNet and add scale=0.0167 in the image iterator. However, the training accuracy is very low: the ResNeXt-101 model can reach 80+ while DPN reaches only 40+. I have verified that using the latest MXNet with scale=0.0167 gives Top-1 ~95.0 on the ImageNet validation set. So it is very strange that fine-tuning DPN on a new dataset is not working well. I also tried fixing all layers except the last fc for classification, and the performance is still very low. Do you have any comments on how to fine-tune DPN on a new dataset? Thanks.

The op of slice_axis?

Hi, I have checked the JSON file of your model and found a slice_axis op, but I can't find the implementation of this operation in mxnet/src/operator.

How should I verify the trained model with run_val.sh?

I have seen in the code that the validation data is set by the '--data-val' parameter, and the default value is '/tmp/val.rec'. But I don't know what and where this file is. Also, if I want to test a single image, what should I do?
Thanks very much!

batch normalization layer

I noticed that in the model JSON files there are no "moving_mean" and "moving_variance" entries in the BatchNorm layers. Can you explain why? Thx.

How to train DPNs

I tried to train DPNs by modifying score.py, but it doesn't work; the Train-accuracy is always 0.

DPN-98 on Places365-Standard dataset

Hi,
Thanks for sharing. I want to use a DPN-98 or DPN-92 model pretrained on the Places365-Standard dataset.
Can you give me a link?
Thank you very much

The order of Blocks: BN-Act-Conv2d or Conv2d-BN-Act?

Hello,
Your implementation uses the micro-block order BN-Act-Conv2d, whereas ResNeXt uses the micro-block order Conv2d-BN-Act. So between the two implementations, the Conv2d seems to be missing in the first block.

Reading your paper, if I understand correctly, the implementation should follow the ResNeXt style, such as the one implemented by Titu1994.

Can you help clarify the difference (if any)? Thanks for your help.

about inference image shape

In score.py, the shape of the input image is [3, 320, 320] during the inference step; I would like to know how you did this.
I saw that you use ImageRecordIter to preprocess the input images. Does it just resize the image to [3, 320, 320], or does it apply other operations?
Thanks for your help!

Some questions about hyperparameters

Hi! Thanks for your impressive work! I'm trying to reproduce your results. Would you share more information about your hyperparameters, especially the learning-rate schedule (steps), for which I cannot find any detailed values?

"Mean-Max pooling"

"Please let me know if any other resarchers have proposed exactly the same technique."

You may want to search for "mixed pooling"; some examples are:

[1] "Mixed Pooling for Convolutional Neural Networks"- Dingjun Yu, Hanli Wang, Peiqiu Chen, and Zhihua Wei

[2] "Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree" - Chen-Yu Lee, Patrick W. Gallagher, Zhuowen Tu

caffe model?

Hi,
Do DPNs have a trained model based on Caffe?

OSError

Hi, when I run score.py, I get the following error:
Traceback (most recent call last):
File "D:/PycharmProjects/DPNs-master/score.py", line 110, in
speed = score(metrics=metrics, **vars(args))
File "D:/PycharmProjects/DPNs-master/score.py", line 42, in score
inter_method=2 # bicubic
File "D:\Python35\lib\site-packages\mxnet\io.py", line 725, in creator
ctypes.byref(iter_handle)))
OSError: exception: access violation reading 0x0000000000000090
Can you help me? Thanks a lot.
