
Cross Stage Partial Networks

This is the implementation of "CSPNet: A New Backbone that can Enhance Learning Capability of CNN" using the Darknet framework.

To install the Darknet framework, refer to darknet (AlexeyAB).

Combined with CIoU loss, scale sensitivity, IoU threshold, greedy NMS, mosaic augmentation, and more, CSPResNeXt-50-PANet-SPP achieves impressive results on the MS COCO object detection test-dev set:

| Model | Size | FPS | AP | AP50 | AP75 | APS | APM | APL | cfg | weight |
|---|---|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP (SAM) | 512×512 | - | 42.7 | 64.6 | 46.3 | 23.7 | 46.1 | 55.3 | - | - |
| CSPResNeXt50-PANet-SPP (SAM) | 608×608 | - | 43.2 | 65.4 | 47.1 | 26.1 | 46.7 | 53.2 | - | - |
| CSPResNeXt50-PANet-SPP (GIoU) | 512×512 | - | 42.4 | 64.4 | 45.9 | 23.3 | 45.9 | 55.0 | - | - |
| CSPResNeXt50-PANet-SPP (GIoU) | 608×608 | - | 43.1 | 65.4 | 47.0 | 26.0 | 46.9 | 52.8 | - | - |
| CSPResNeXt50-PANet-SPP | 512×512 | 44 (1080 Ti) / 67 (GV100) | 42.4 | 64.4 | 45.9 | 23.2 | 45.5 | 55.3 | cfg | weight |
| CSPResNeXt50-PANet-SPP | 608×608 | 35 (1080 Ti) / 44 (GV100) | 43.2 | 65.4 | 47.0 | 25.7 | 46.7 | 53.3 | cfg | weight |
| CSPDarknet53-PANet-SPP | 512×512 | 51 (1080 Ti) | 42.4 | 64.5 | 46.0 | 23.9 | 45.6 | 54.2 | cfg | weight |

ImageNet

Big Models

| Model | #Parameters | BFLOPs | Top-1 | Top-5 | cfg | weight |
|---|---|---|---|---|---|---|
| DarkNet-53 [1] | 41.57M | 18.57 | 77.2 | 93.8 | cfg | weight |
| CSPDarkNet-53 | 27.61M (-34%) | 13.07 (-30%) | 77.2 (=) | 93.6 (-0.2) | cfg | weight |
| CSPDarkNet-53-Elastic | - | 7.74 (-58%) | 76.1 (-1.1) | 93.3 (-0.5) | cfg | weight |
| ResNet-50 [2] | 22.73M | 9.74 | 75.8 | 92.9 | cfg | weight |
| CSPResNet-50 | 21.57M (-5%) | 8.97 (-8%) | 76.6 (+0.8) | 93.3 (+0.4) | cfg | weight |
| CSPResNet-50-Elastic | - | 9.36 (-4%) | 76.8 (+1.0) | 93.5 (+0.6) | cfg | weight |
| ResNeXt-50 [3] | 22.19M | 10.11 | 77.8 | 94.2 | cfg | weight |
| CSPResNeXt-50 | 20.50M (-8%) | 7.93 (-22%) | 77.9 (+0.1) | 94.0 (-0.2) | cfg | weight |
| CSPResNeXt-50-Elastic | - | 5.45 (-46%) | 77.2 (-0.6) | 93.8 (-0.4) | cfg | weight |
| CSPResNeXt-50+Elastic | - | 7.82 (-23%) | 78.2 (+0.4) | 94.2 (=) | - | - |
| HarDNet-138s [4] | 35.5M | 13.4 | 77.8 | - | - | - |
| DenseNet-264-32 [5] | 27.21M | 11.03 | 77.8 | 93.9 | - | - |
| ResNet-152 [2] | 60.2M | 22.6 | 77.8 | 93.6 | - | - |
| DenseNet-201+Elastic [6] | 19.48M | 8.77 | 77.9 | 94.0 | - | - |
| CSPDenseNet-201+Elastic | 20.17M (+4%) | 7.13 (-19%) | 77.9 (=) | 94.0 (=) | - | - |
| Res2NetLite-72 [7] | - | 5.19 | 74.7 | 92.1 | cfg | weight |

Small Models

| Model | #Parameters | BFLOPs | Top-1 | Top-5 | cfg | weight |
|---|---|---|---|---|---|---|
| PeleeNet [8] | 2.79M | 1.017 | 70.7 | 90.0 | - | - |
| PeleeNet-swish | 2.79M | 1.017 | 71.5 | 90.7 | - | - |
| PeleeNet-swish-SE | 2.81M | 1.017 | 72.1 | 91.0 | - | - |
| CSPPeleeNet | 2.83M (+1%) | 0.888 (-13%) | 70.9 (+0.2) | 90.2 (+0.2) | - | - |
| CSPPeleeNet-swish | 2.83M (+1%) | 0.888 (-13%) | 71.7 (+0.2) | 90.8 (+0.1) | - | - |
| CSPPeleeNet-swish-SE | 2.85M (+1%) | 0.888 (-13%) | 72.4 (+0.3) | 91.0 (=) | - | - |
| SparsePeleeNet [9] | 2.39M | 0.904 | 69.6 | 89.3 | - | - |
| EfficientNet-B0* [10] | 4.81M | 0.915 | 71.3 | 90.4 | cfg | weight |
| EfficientNet-B0 (official) [10] | - | - | 70.0 | 88.9 | - | - |
| MobileNet-v2 [11] | 3.47M | 0.858 | 67.0 | 87.7 | cfg | weight |
| CSPMobileNet-v2 | 2.51M (-28%) | 0.764 (-11%) | 67.7 (+0.7) | 88.3 (+0.6) | cfg | weight |
| Darknet Ref. [12] | 7.31M | 0.96 | 61.1 | 83.0 | cfg | weight |
| CSPDenseNet Ref. | 3.48M (-52%) | 0.886 (-8%) | 65.7 (+4.6) | 86.6 (+3.6) | - | - |
| CSPPeleeNet Ref. | 4.10M (-44%) | 1.103 (+15%) | 68.9 (+7.8) | 88.7 (+5.7) | - | - |
| CSPDenseNetb Ref. | 1.38M (-81%) | 0.631 (-34%) | 64.2 (+3.1) | 85.5 (+2.5) | - | - |
| CSPPeleeNetb Ref. | 2.01M (-73%) | 0.897 (-7%) | 67.8 (+6.7) | 88.1 (+5.1) | - | - |
| ResNet-10 [2] | 5.24M | 2.273 | 63.5 | 85.0 | cfg | weight |
| CSPResNet-10 | 2.73M (-48%) | 1.905 (-16%) | 65.3 (+1.8) | 86.5 (+1.5) | - | - |
| MixNet-M-GPU | - | 1.065 | 71.5 | 90.5 | - | - |

※EfficientNet-B0* is implemented in the Darknet framework.

※EfficientNet-B0 (official) is trained with the official code and a batch size of 256.

※The Swish activation function is presented in [13].

※The squeeze-and-excitation (SE) module is presented in [14].

※MixNet-M-GPU is modified from MixNet-M [21].

Some tricks for improving Acc

1. Activation function

| Model | Activation | Top-1 | Top-5 |
|---|---|---|---|
| PeleeNet | LReLU | 70.7 | 90.0 |
| PeleeNet | Swish | 71.5 (+0.8) | 90.7 (+0.7) |
| PeleeNet | Mish | 71.4 (+0.7) | 90.4 (+0.4) |
| CSPPeleeNet | LReLU | 70.9 | 90.2 |
| CSPPeleeNet | Swish | 71.7 (+0.8) | 90.8 (+0.6) |
| CSPPeleeNet | Mish | 71.2 (+0.3) | 90.3 (+0.1) |
| CSPResNeXt-50 | LReLU | 77.9 | 94.0 |
| CSPResNeXt-50 | Mish | 78.9 (+1.0) | 94.5 (+0.5) |

※The Swish activation function is not suitable for ResNeXt-based models; details are shown in the Mish paper [22].
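For reference, a minimal sketch of the three activations compared above (written in PyTorch for readability; Darknet's C implementations may differ in constants such as the leaky slope):

```python
import torch
import torch.nn.functional as F

def leaky_relu(x):
    # Darknet's "leaky" activation uses a fixed negative slope of 0.1
    return F.leaky_relu(x, negative_slope=0.1)

def swish(x):
    # Swish [13]: x * sigmoid(x)
    return x * torch.sigmoid(x)

def mish(x):
    # Mish [22]: x * tanh(softplus(x))
    return x * torch.tanh(F.softplus(x))

x = torch.linspace(-3, 3, steps=7)
for fn in (leaky_relu, swish, mish):
    print(fn.__name__, fn(x))
```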

2. Data augmentation

| Model | Augmentation | Top-1 | Top-5 |
|---|---|---|---|
| CSPResNeXt-50 | Normal | 77.9 | 94.0 |
| CSPResNeXt-50 | Mixup | 77.2 | 94.0 |
| CSPResNeXt-50 | Cutmix | 78.0 | 94.3 |
| CSPResNeXt-50 | Cutmix+Mixup | 77.7 | 94.4 |
| CSPResNeXt-50 | Mosaic | 78.1 | 94.5 |
| CSPResNeXt-50 | Blur | 77.5 | 93.8 |

※Mixup is presented in [23] and used in [24].

※CutMix is presented in [25].

※Note: the mixup and CutMix implementations still have to be verified.
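Since the mixup implementation is still being checked, here is a minimal sketch of the idea as described in [23]; the alpha value and the soft-label mixing are assumptions, not this repository's exact code:

```python
import numpy as np

def mixup(images, one_hot_labels, alpha=0.2, rng=None):
    # Blend pairs of samples and their labels with a Beta-distributed weight [23]
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    idx = rng.permutation(len(images))
    mixed_images = lam * images + (1.0 - lam) * images[idx]
    mixed_labels = lam * one_hot_labels + (1.0 - lam) * one_hot_labels[idx]
    return mixed_images, mixed_labels

images = np.random.rand(8, 224, 224, 3).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[np.random.randint(0, 10, size=8)]
mixed_images, mixed_labels = mixup(images, labels)
print(mixed_images.shape, mixed_labels.shape)
```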

3. Other

| Model | Method | Top-1 | Top-5 |
|---|---|---|---|
| CSPResNeXt-50 | Normal | 77.9 | 94.0 |
| CSPResNeXt-50 | Smooth | 78.1 | 94.4 |

※"Smooth" means label smoothing, which is presented in [26].
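As a quick illustration of label smoothing [26] (a sketch, assuming eps = 0.1 and uniform redistribution over the K classes):

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    # Soften a one-hot target: the true class keeps 1 - eps + eps/K,
    # and every other class receives eps / K
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / k

y = np.eye(5, dtype=np.float32)[[2]]
print(smooth_labels(y))  # [[0.02 0.02 0.92 0.02 0.02]]
```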

MS COCO

GPU Real-time Models

| Model | Size | 1080 Ti FPS | AP | AP50 | AP75 | cfg | weight |
|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | 44 | 38.0 | 60.0 | 40.8 | cfg | weight |
| CSPDarknet53-PANet-SPP | 512×512 | 51 | 38.7 | 61.3 | 41.7 | cfg | weight |
| CSPResNet50-PANet-SPP | 512×512 | 55 | 38.0 | 60.5 | 40.7 | cfg | weight |

※PANet is presented in [15].

※SPP is presented in [16].

CPU Real-time Models

| Model | Size | 9900K FPS | AP | AP50 | AP75 | cfg | weight |
|---|---|---|---|---|---|---|---|
| YOLOv3-tiny [1] | 416×416 | 54 | - | 33.1 | - | cfg | weight |
| YOLOv3-tiny-PRN [18] | 416×416 | 71 | - | 33.1 | - | cfg | weight |
| SNet49-ThunderNet* [19] | 320×320 | 47 | 19.1 | 33.7 | 19.6 | - | - |
| Ours | 320×320 | 102 | 15.3 | 34.2 | 12.0 | - | - |
| SNet146-ThunderNet* [19] | 320×320 | 32 | 23.6 | 40.2 | 24.5 | - | - |
| Ours | 320×320 | 52 | 19.4 | 40.0 | 17.0 | - | - |
| Pelee** [7] | 304×304 | 7 | 22.4 | 38.3 | 22.9 | - | - |
| RefineDetLite** [20] | 320×320 | 8 | 26.8 | 46.6 | 27.4 | - | - |

※SNet49-ThunderNet* and SNet146-ThunderNet* are tested on a Xeon E5-2682 v4.

※Pelee** and RefineDetLite** are tested on an i7-6700.

Some tricks for improving AP

1. NMS threshold

| Model | Size | Threshold | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | 0.45 | 38.0 | 60.0 | 40.8 | 19.7 | 41.4 | 49.9 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.50 | 38.2 | 60.2 | 41.1 | 19.8 | 41.6 | 50.1 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.55 | 38.4 | 60.1 | 41.3 | 20.0 | 41.7 | 50.3 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.60 | 38.5 | 60.0 | 41.7 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.65 | 38.6 | 59.7 | 42.1 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.70 | 38.5 | 59.2 | 42.4 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.45 | 39.4 | 59.4 | 42.5 | 20.4 | 42.6 | 51.4 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.50 | 39.7 | 59.5 | 42.7 | 20.5 | 42.5 | 51.7 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.55 | 39.8 | 59.5 | 43.0 | 20.7 | 43.1 | 51.9 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.60 | 40.0 | 59.3 | 43.4 | 20.8 | 43.2 | 52.0 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.65 | 40.1 | 59.0 | 43.8 | 20.9 | 43.4 | 52.1 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.70 | 40.1 | 58.6 | 44.2 | 20.9 | 43.4 | 52.1 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | aware | 40.0 | 59.5 | 43.4 | 20.8 | 43.2 | 52.0 |

※GIoU is presented in [17].
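To make the "Threshold" column concrete, here is a minimal greedy-NMS sketch: a detection is suppressed when its IoU with an already-kept, higher-scoring box exceeds the threshold (plain NumPy, not the Darknet implementation):

```python
import numpy as np

def iou(box, boxes):
    # IoU of one (x1, y1, x2, y2) box against an array of boxes
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, thresh=0.45):
    # Keep the highest-scoring box, drop everything overlapping it by more
    # than `thresh`, then repeat on the survivors.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores))  # [0, 2] with the default 0.45 threshold
```

Raising the threshold keeps more overlapping boxes, which the table above shows trades a little AP50 for better AP75.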

2. Activation function

| Model | Size | Activation | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPPeleeNet-PRN | 416×416 | Leaky ReLU | 23.1 | 44.5 | 22.0 | 6.6 | 24.4 | 35.3 |
| CSPPeleeNet-PRN | 416×416 | Swish | 24.1 | 45.8 | 23.3 | 6.8 | 26.1 | 35.5 |

3. Loss function

| Model | Size | Loss | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | MSE | 38.0 | 60.0 | 40.8 | 19.7 | 41.4 | 49.9 |
| CSPResNeXt50-PANet-SPP | 512×512 | GIoU | 39.4 | 59.4 | 42.5 | 20.4 | 42.6 | 51.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | DIoU | 39.1 | 58.8 | 42.1 | 20.1 | 42.4 | 50.7 |
| CSPResNeXt50-PANet-SPP | 512×512 | CIoU | 39.6 | 59.2 | 42.6 | 20.5 | 42.9 | 51.6 |

※DIoU and CIoU are presented in [27].
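For illustration, a minimal sketch of the GIoU loss [17] on axis-aligned boxes (plain Python; the in-tree Darknet loss adds normalizers and handles degenerate boxes):

```python
def giou_loss(a, b):
    # Boxes as (x1, y1, x2, y2). GIoU = IoU - (C - union) / C, where C is the
    # area of the smallest box enclosing both; the loss is 1 - GIoU [17].
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    cw = max(a[2], b[2]) - min(a[0], b[0])   # enclosing box width
    ch = max(a[3], b[3]) - min(a[1], b[1])   # enclosing box height
    enclose = cw * ch
    return 1.0 - (iou - (enclose - union) / enclose)

print(giou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # ~1.079
```

DIoU and CIoU [27] extend this by penalizing center-point distance (and, for CIoU, aspect-ratio inconsistency) instead of the enclosing-box term.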

Citation

```
@inproceedings{wang2020cspnet,
  title={CSPNet: A new backbone that can enhance learning capability of cnn},
  author={Wang, Chien-Yao and Mark Liao, Hong-Yuan and Wu, Yueh-Hua and Chen, Ping-Yang and Hsieh, Jun-Wei and Yeh, I-Hau},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  pages={390--391},
  year={2020}
}
```

Reference

[1] YOLOv3: An Incremental Improvement

[2] Deep Residual Learning for Image Recognition (CVPR 2016)

[3] Aggregated Residual Transformations for Deep Neural Networks (CVPR 2017)

[4] HarDNet: A Low Memory Traffic Network (ICCV 2019)

[5] Densely Connected Convolutional Networks (CVPR 2017)

[6] ELASTIC: Improving CNNs with Dynamic Scaling Policies (CVPR 2019)

[7] RefineDetLite: A Lightweight One-stage Object Detection Framework for CPU-only Devices

[8] Pelee: A Real-Time Object Detection System on Mobile Devices (NeurIPS 2018)

[9] Sparsely Aggregated Convolutional Networks (ECCV 2018)

[10] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)

[11] MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)

[12] https://pjreddie.com/darknet/tiny-darknet/

[13] Searching for Activation Functions

[14] Squeeze-and-Excitation Networks (CVPR 2018)

[15] Path Aggregation Network for Instance Segmentation (CVPR 2018)

[16] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition (TPAMI 2015)

[17] Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression (CVPR 2019)

[18] Enriching Variety of Layer-wise Learning Information by Gradient Combination (ICCVW 2019)

[19] ThunderNet: Towards Real-time Generic Object Detection (ICCV 2019)

[20] RefineDetLite: A Lightweight One-stage Object Detection Framework for CPU-only Devices

[21] MixConv: Mixed Depthwise Convolutional Kernels

[22] Mish: A Self Regularized Non-Monotonic Neural Activation Function

[23] mixup: Beyond Empirical Risk Minimization (ICLR 2018)

[24] Bag of Freebies for Training Object Detection Neural Networks

[25] CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (ICCV 2019)

[26] Rethinking the Inception Architecture for Computer Vision (CVPR 2016)

[27] Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression (AAAI 2020)

Acknowledgements

https://github.com/AlexeyAB/darknet

https://github.com/ultralytics/yolov3


Issues

Did you compare AP and FPS (rather than BFLOPs) of models?

@WongKinYiu Hi,

Did you compare AP (MS COCO) and 1080 Ti FPS (rather than BFLOPs) of the models?

With the same network resolution, mini_batch = batch/subdivisions, data augmentation, and dataset, to compare apples with apples.

The problem of using label_smooth_eps=0.1 for classification

I used the csdarknet53-omega.cfg file to train a classification network (not detection) on my own dataset. With label_smooth_eps=0.1, the loss increases gradually no matter what learning rate I set; without it, the loss gradually converges. Why is that?
By the way, I have five categories with an average of 1,800 pictures per category. How many training iterations and what batch size should I set?

Shortcut error in csresnext50-panet-spp-original-optimal.cfg

In the cfg file, the first [shortcut] layer takes from=-4, which points to a convolutional layer with filters=64, while the convolutional layer immediately before the [shortcut] has filters=128. They cannot be added; is this an error?
```
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=8
width=512
height=512
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.00261
burn_in=1000
max_batches = 500500
policy=steps
steps=400000,450000
scales=.1,.1

#cutmix=1
mosaic=1

#19:104x104 38:52x52 65:26x26 80:13x13 for 416

[convolutional]
batch_normalize=1
filters=64
size=7
stride=2
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

# 1-1

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
groups=32
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky
```

About the ImageNet classification training results of cspdarknet53

Hello,
I looked at your csdarknet53.cfg and found the following settings:
angle=7
hue=.1
saturation=.75
exposure=.75
aspect=.75
Were these augmentations also enabled when training on the ImageNet dataset, i.e. is that how the 77% top-1 accuracy was obtained?

About the weight_decay setting in the cspdarknet53 cfg

Hello, I see decay=0.0005 in your cfg/csdarknet53.cfg, but your CSPNet paper says weight_decay=0.005, and the YOLOv4 paper also says weight_decay=0.005. In practice, does this difference matter much when training the cspdarknet53 classifier?

Bad inference performance with CSPResNeXt50-PANet-SPP

Hi,

I've been inspecting CSPResNeXt50-PANet-SPP for real-time human detection. According to the readme of this repository, CSPResNeXt50-PANet-SPP outperforms YOLOv3 in AP on the COCO dataset.

To verify this, I downloaded the cfg and weights of CSPResNeXt50-PANet-SPP and compared it with YOLOv3 (yolov3.cfg + yolov3.weights, the result of COCO training).

As far as I could observe, CSPResNeXt50-PANet-SPP is not better than YOLOv3, at least for my case of detecting humans in video streams. Here is an example of the results of both networks:

  1. Inference result with CSPResNeXt50-PANet-SPP:
     [image: CSPResNeXt50-PANet-SPP detection]

  2. Inference result with YOLOv3:
     [image: yolov3 detection]

My question is whether these images represent a special case where CSPResNeXt50-PANet-SPP may perform worse than YOLOv3, for instance for small objects like the humans in these images? Or what is the best way to explain this?

Thanks in advance.

CSPMobileNet in PyTorch

I am trying to combine CSP with MobileNet in PyTorch, but it is hard to read the Darknet config of that model. Can you visualize this model in a tool like Netron?

FLOPs mismatch across different frameworks.

Hi @WongKinYiu

Thanks for your answer to my previous question. However, I found a severe FLOPs mismatch between your paper and the Darknet numbers @AlexeyAB.

For example, for ResNet-50, the original paper states that the FLOPs are around 3.8 to 4.0 GFLOPs. However, in your paper it is 9.74 BFLOPs, a big difference. There seems to be a ratio of about 2.59 between them, right?
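One plausible explanation (my assumption, not an official statement from either repo): Darknet's BFLOPs count a multiply-add as two operations, and the classifier cfgs here run at 256×256, while the ResNet paper reports multiply-accumulates at 224×224:

```python
# Expected ratio if Darknet counts 2 ops per MAC and profiles at 256x256
expected = 2 * (256 / 224) ** 2
observed = 9.74 / 3.8          # Darknet BFLOPs vs. paper GMACs for ResNet-50
print(round(expected, 2), round(observed, 2))  # ~2.61 vs. ~2.56
```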

Gradient calculation in paper

Hi,
I am interested in CSPNet recently, and reading the paper: https://arxiv.org/pdf/1911.11929.pdf.
But I have a question about the gradient calculation in page 4, in the paper the gradient calculate as

w1' = f(w1, g0)
w2' = f(w2, g0, g1)
...
wk' = f(wk, g0, g1, g2, ..., gk-1)

Don't this part is calculated as this?

w1' = f(w1, g0, g1, g2, ..., gk)
w2' = f(w2, g1, g2, ..., gk)
...
wk' = f(wk, gk)

also I want to confirm that if the definition of gi is the partial differential of error to weight? that is,

I was very confuse about this part, hope that you can help me.

How to split the feature map in two

In your paper, I see that the first feature map is split in two: one part goes to the concat, and the other goes through the conv path and is copied forward. But I didn't find any corresponding operation in your training cfg file, csresnext50-panet-spp-original-optimal.cfg.
@WongKinYiu @AlexeyAB
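For anyone else puzzled by this: my reading of csresnext50-panet-spp-original-optimal.cfg (an interpretation, not the authors' statement) is that the "split" is realized by two 1×1 convolutions fed from the same input via [route] layers=-2, rather than by literally slicing channels. A minimal PyTorch sketch of that structure:

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Sketch of a CSP-style stage: two 1x1 "split" convs from the same input,
    a processed path, a concat, and a 1x1 transition (cf. the cfg's [route])."""
    def __init__(self, c_in, c_mid, blocks):
        super().__init__()
        self.part1 = nn.Conv2d(c_in, c_mid, 1)           # bypass path
        self.part2 = nn.Conv2d(c_in, c_mid, 1)           # dense/residual path
        self.blocks = blocks                              # any nn.Module stack
        self.transition = nn.Conv2d(2 * c_mid, c_in, 1)   # fuse after concat

    def forward(self, x):
        y1 = self.part1(x)
        y2 = self.blocks(self.part2(x))
        return self.transition(torch.cat([y1, y2], dim=1))

blk = CSPBlock(64, 64, nn.Identity())
print(blk(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```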

Question about training time

I read in the paper that you train the model on a single GPU. How long does it take to train on a single GPU? Why not train on multiple GPUs?

Question about inference time with batch size 1 or larger

I use CSPDarknet as the backbone of my detector in mmdetection, but the inference time of CSPDarknet is slower than Darknet when the batch size is 1. However, when the batch size increases to 8 or more, the inference time is faster than Darknet.
Is this normal, or is something wrong?

Thanks!

Problems understanding the concept of "csdarknet53s-panet-spp"

@WongKinYiu

Hello! Thanks for your great work! I want to use "csdarknet53s-panet-spp" and "yolov3-spp" in my master's thesis. I understand yolov3-spp, but I don't quite understand csdarknet53s-panet-spp.

  1. First of all, what's the difference between "csdarknet53s" and "csdarknet53m"?
  2. The detector is called "panet", yet the cfg file also has 3 YOLO layers. So what's the difference between PANet and YOLO? Is it just that the augmented bottom-up pathway was added to the FPN architecture, that the adaptive feature pooling layer was added after this extension, and that the backbone in general (CSPDarknet) is different because it uses partial dense blocks?
  3. Also, the main difference between vanilla Darknet and CSPDarknet is not quite clear to me. Is it just that we use several partial dense blocks, in which we split the base feature map, concatenate one part with several feature maps and feed the concatenation into the transition layer, while the other part of the base feature map is then combined with the transitioned concatenation?

Also, for training csdarknet53s-panet-spp we need "csdarknet53.conv.104", as you wrote in issue #17. However, I can't find the file. Could you please link it?

Different results between the repo and the preprint paper

In the preprint paper, I saw different results on the MS COCO object detection task. For instance, at size 608×608, CSPResNeXt50-PANet-SPP has an AP of 43.2 in the repo compared with 38.4 in the paper. That is a big gap; could you explain it?
[screenshot from ReadMe.md]
[screenshot from the preprint paper]

One more question: the input size of csresnext50-panet-spp-original-optimal.cfg is 512, so the last YOLO output grids are, in order, 64, 32, and 16. When you run the optimal model at input size 608 they become 76, 38, and 19. Do you need to modify the model to get the best result?
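On the second question: the YOLO grid sizes follow directly from the three head strides (8, 16, 32), so they can be recomputed for any input size divisible by 32, as this small check shows:

```python
# Grid size per detection head = input size / stride
for size in (512, 608):
    print(size, [size // stride for stride in (8, 16, 32)])
# 512 [64, 32, 16]
# 608 [76, 38, 19]
```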

yolov3-spp-matrix.cfg in ultralytics/yolov3 - missing 'share_index' field implementation

Hi @WongKinYiu, thanks for the great work here! I introduced new error checking in https://github.com/ultralytics/yolov3 for custom cfgs, and after running your yolov3-spp-matrix.cfg through it, I realized my implementation does not handle your 'share_index' field correctly. What exactly do I do with the value share_index=115 here?

```
AssertionError: Unsupported field 'share_index' in cfg/yolov3-spp-matrix.cfg.
```

```
[convolutional]
share_index=115
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
```

Training never starts?

I used csresnext50-panet-spp-original-optimal.cfg and the weight file associated with it. I followed the steps in @AlexeyAB's YOLO repo and changed my cfg accordingly: I set all convolutional filters before the [yolo] layers to 18, as I have 1 class, reduced max_batches to 10000, and set the steps to 8000,9000.
But the training never starts; it loads the model and just stops.
How do I fix this?

Need help setting hyper-parameters

@WongKinYiu I have been trying to set the right hyper-parameters for yolov3-spp on the complete Open Images dataset, but after 300-400 iterations the server restarts. I previously trained on 3 of the 601 classes with a small dataset of about 1,100 images, and used single-GPU parameters even for multi-GPU training. Now, training on the whole dataset with multi-GPU parameters causes a system reboot.
By the way, how do you calculate the parameters for multi-GPU training? You already replied to me in previous issues on @AlexeyAB's repo about how to set burn_in, learning rate, and decay in the cfg file, but that setup is causing the issue, so I changed almost all parameters back to the single-GPU configuration except burn_in, and the problem persists.
[screenshots of the hardware and training output]
For the above hardware, here is the link to the config I'm using.
Please help me out.
Thanks

Can csmobilenetv2 use TensorRT to accelerate inference?

I have trained my own model with csmobilenetv2 as the backbone and yolov3_tiny as the head, and got the final best weights. Now I want to accelerate inference with TensorRT version 5.1.6.1, but I get errors when translating the weights to an ONNX model. Using yolov3.weights/yolov3.cfg and yolov3-tiny.weights/yolov3-tiny.cfg works fine.

tensorrt:5.1.6.1
python:3.6.9
onnx:1.4.1
numpy:1.18.4

```
Traceback (most recent call last):
  File "yolov3_to_onnx.py", line 840, in <module>
    main()
  File "yolov3_to_onnx.py", line 827, in main
    verbose=True)
  File "yolov3_to_onnx.py", line 447, in build_onnx_graph
    params)
  File "yolov3_to_onnx.py", line 322, in load_conv_weights
    conv_params, 'conv', 'weights')
  File "yolov3_to_onnx.py", line 351, in _create_param_tensors
    conv_params, param_category, suffix)
  File "yolov3_to_onnx.py", line 383, in _load_one_param_type
    buffer=self.weights_file.read(param_size * 4))
TypeError: buffer is too small for requested array
```

How to improve the model for custom data

@WongKinYiu Hi,

I have a problem: sometimes some pictures are not detected, or are detected incorrectly. I attached my model and some test images. Could you please check them and guide me? I have about 2K images per class. Please give me some information about the hyper-parameters for my case.

[attachment: file]

Thanks in advance

Originally posted by @zpmmehrdad in #6 (comment)

[Question] How to run inference with a classifier model

Hi @WongKinYiu, I want to ask a question. According to the pjreddie website, an image classification model can be trained and inferenced using the Darknet framework: https://pjreddie.com/darknet/imagenet/. But is there any way to run inference with the classifier model without using the Darknet framework? For example, an object detection model can be inferenced using OpenCV DNN or by converting it to another framework. Can you kindly give me advice? Thank you so much in advance.
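One option that avoids the Darknet binary is OpenCV's DNN module, which can parse Darknet cfg/weights pairs. A hedged sketch (the file names are placeholders, and not every layer type in every cfg is guaranteed to be supported by OpenCV's importer):

```python
import cv2
import numpy as np

# Load a Darknet classifier; cfg/weights paths here are placeholders.
net = cv2.dnn.readNetFromDarknet("csdarknet53.cfg", "csdarknet53.weights")

img = cv2.imread("test.jpg")
# Scale pixels to [0, 1], resize to the cfg's input size, and swap BGR->RGB.
blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0, size=(256, 256),
                             swapRB=True, crop=False)
net.setInput(blob)
probs = net.forward().flatten()
print(int(np.argmax(probs)), float(probs.max()))  # top-1 class id and score
```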

Mosaic Augmentation Paper?

@WongKinYiu @AlexeyAB great work on the README; there's a wealth of information here! I saw that mosaic augmentation outperformed all the rest in your README tests, including better-known ones like CutMix.

Based on this, I wonder if it's worth publishing a short paper on arXiv; we could then link to this GitHub repo, alexeyab/darknet, and ultralytics/yolov3 for the mosaic dataloader code.

HarDNet

Do you have a Darknet implementation of HarDNet?

How many layers to extract from a .cfg ?

I want to retrain csmobilenet-v2.cfg for a different task, starting from the provided weights.

How many layers should I extract from the provided weights?

There are 59 [convolutional] layers; since other layers like max-pooling and routes don't have weights, and since I am changing the number of convolutional filters immediately before the softmax, I guess the answer is 58?

Like:

```
./darknet partial csmobilenet-v2.cfg csmobilenet-v2.weights csmobilenet-v2.conv.58 58
```

What does CIO mean?

What does CIO mean? I can't figure it out; please tell me the full name of CIO!

The figure comparing YOLOv4 and YOLOv5

Hello, I found this figure in an issue; it compares the detection results of different YOLO versions.
[figure: AP vs. average latency for different YOLO models]
There are several points in the results for each YOLO; I would like to know what they mean. Why does AP increase as average latency increases?

Try to train fast (grouped-conv) versions of csdarknet53 and csdarknet19

@WongKinYiu Hi,

Since CSPDarkNet53 is better than CSPResNeXt50 for the detector, try to train these 4 models:

| Model | GPU | 256×256 (FPS) | 512×512 (FPS) | 608×608 (FPS) |
|---|---|---|---|---|
| darknet53.cfg (original) | RTX 2070 | 113 | 56 | 38 |
| csdarknet53.cfg (original) | RTX 2070 | 101 | 57 | 41 |
| csdarknet53g.cfg.txt | RTX 2070 | 122 | 64 | 46 |
| csdarknet53ghr.cfg.txt | RTX 2070 | 100 | 75 | 57 |
| spinenet49.cfg.txt (low priority) | RTX 2070 | 49 | 44 | 43 |
| csdarknet19-fast.cfg.txt | RTX 2070 | 213 | 149 | 116 |

csdarknet19-fast.cfg contains DropBlock, so use the latest version of Darknet, which uses fast random functions for DropBlock.

How to use a cfg?

Hi,

I've been using Darknet (AlexeyAB) for a while and I am familiar with configuring a cfg for custom object-detection training. This repository was suggested to me, and the claimed results seem promising.

But I could not find how to use a cfg for training. For instance, I could not figure out how to adapt the cfg to the number of classes I'd like to train for. In regular Darknet, one has to configure some parameters in the YOLO layers (number of filters, classes, etc.) in order to adapt the cfg to a custom training.

Any help is appreciated.
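For the class-count question specifically, the usual AlexeyAB-Darknet convention applies to these cfgs as well (stated as my understanding, not this repo's documentation): set classes= in each [yolo] layer, and set filters= in the [convolutional] layer immediately before it to (classes + 5) × 3:

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    # Each anchor predicts 4 box coordinates + 1 objectness + num_classes scores
    return (num_classes + 5) * anchors_per_scale

print(yolo_filters(1))   # 18, as in the "Training never starts?" issue above
print(yolo_filters(80))  # 255 for COCO
```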

Training steps mismatch between the paper and the code in the ImageNet experiments

Hi,

In the ImageNet experiments, the paper says the models are trained for 800 epochs:

[screenshot from the paper]

However, the code says they are trained for 80 epochs:

[screenshot from the cfg]

So there is a big difference.

Besides, I tried to re-implement it in PyTorch, and the accuracy is 7-8 points behind your method, even though the network architecture and the number of parameters are the same as in your Darknet results.

Best,
Mu

Does the CSP model need more training time?

Due to the limitation of my GPU devices, I only tested the model with epoch = 1, and found that, compared with the traditional ResNeXt model, the result of the CSPResNeXt model after one epoch is not satisfactory. Is it because of the residual links that the model needs more time to learn?

Could you provide the cfg of CSPPeleeNet ?

I am interested in CSPPeleeNet. Since you haven't provided its cfg yet, I planned to make it myself. But according to your cfg files for ResNet-50 and CSPResNet-50, going from ResNet-50 to CSPResNet-50 you not only added the CSP connections but also changed other parts of the network structure, such as the number of blocks ([3 4 6 3] vs. [3 3 5 2]) and the number of filters. So could you provide the cfg of CSPPeleeNet? Thanks.

How to edit a cfg file

Hello,
I am trying to build a new small and efficient model by combining csmobilenetv2 (backbone) with yolov3_tiny (head) to detect objects. Everything goes well until about 1000 iterations (avg loss ~ 8.0), but the loss becomes NaN after that.
Does anyone have any ideas? Thank you.
Here is my layer information:
```
batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 2 224 x 224 x 3 -> 112 x 112 x 32 0.022 BF
1 conv 16/ 16 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.007 BF
2 route 0 -> 112 x 112 x 32
3 conv 32 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.026 BF
4 conv 32/ 32 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.007 BF
5 conv 16 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.013 BF
6 route 5 1 -> 112 x 112 x 32
7 conv 48 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 48 0.039 BF
8 conv 48/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 48 0.003 BF
9 route 7 -> 112 x 112 x 48
10 conv 96/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 96 0.005 BF
11 conv 24 1 x 1/ 1 56 x 56 x 96 -> 56 x 56 x 24 0.014 BF
12 conv 72 1 x 1/ 1 56 x 56 x 24 -> 56 x 56 x 72 0.011 BF
13 conv 144/ 72 3 x 3/ 1 56 x 56 x 72 -> 56 x 56 x 144 0.008 BF
14 conv 24 1 x 1/ 1 56 x 56 x 144 -> 56 x 56 x 24 0.022 BF
15 Shortcut Layer: 11, wt = 0, wn = 0, outputs: 56 x 56 x 24 0.000 BF
16 route 15 8 -> 56 x 56 x 72
17 conv 72 1 x 1/ 1 56 x 56 x 72 -> 56 x 56 x 72 0.033 BF
18 conv 72/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 72 0.001 BF
19 route 17 -> 56 x 56 x 72
20 conv 144/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 144 0.002 BF
21 conv 32 1 x 1/ 1 28 x 28 x 144 -> 28 x 28 x 32 0.007 BF
22 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
23 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
24 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
25 Shortcut Layer: 21, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
26 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
27 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
28 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
29 Shortcut Layer: 25, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
30 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
31 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
32 conv 64 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 64 0.019 BF
33 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
34 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
35 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
36 Shortcut Layer: 32, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
37 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
38 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
39 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
40 Shortcut Layer: 36, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
41 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
42 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
43 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
44 Shortcut Layer: 40, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
45 route 44 18 -> 28 x 28 x 136
46 conv 192 1 x 1/ 1 28 x 28 x 136 -> 28 x 28 x 192 0.041 BF
47 conv 192/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 192 0.001 BF
48 route 46 -> 28 x 28 x 192
49 conv 384/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 384 0.001 BF
50 conv 96 1 x 1/ 1 14 x 14 x 384 -> 14 x 14 x 96 0.014 BF
51 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
52 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
53 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
54 Shortcut Layer: 50, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
55 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
56 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
57 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
58 Shortcut Layer: 54, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
59 route 58 47 -> 14 x 14 x 288
60 conv 288 1 x 1/ 1 14 x 14 x 288 -> 14 x 14 x 288 0.033 BF
61 conv 288/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 288 0.000 BF
62 route 60 -> 14 x 14 x 288
63 conv 576/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 576 0.001 BF
64 conv 160 1 x 1/ 1 7 x 7 x 576 -> 7 x 7 x 160 0.009 BF
65 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
66 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
67 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
68 Shortcut Layer: 64, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
69 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
70 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
71 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
72 Shortcut Layer: 68, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
73 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
74 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
75 conv 320 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 320 0.030 BF
76 route 75 61 -> 7 x 7 x 608
77 conv 640 1 x 1/ 1 7 x 7 x 608 -> 7 x 7 x 640 0.038 BF
78 conv 27 1 x 1/ 1 7 x 7 x 640 -> 7 x 7 x 27 0.002 BF
79 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
80 route 75 -> 7 x 7 x 320
81 conv 128 1 x 1/ 1 7 x 7 x 320 -> 7 x 7 x 128 0.004 BF
82 upsample 2x 7 x 7 x 128 -> 14 x 14 x 128
83 route 82 60 -> 14 x 14 x 416
84 conv 256 3 x 3/ 1 14 x 14 x 416 -> 14 x 14 x 256 0.376 BF
85 conv 27 1 x 1/ 1 14 x 14 x 256 -> 14 x 14 x 27 0.003 BF
  86 yolo
```

Performance comparison between YOLOv4 and YOLOv5

Hello Chien-Yao Wang! I am Amusi. I would like to learn about your benchmark results comparing YOLOv4 and YOLOv5, because today I just saw the issue you submitted in your CSPNet open-source project, in which you also mentioned me (#32).

I did not see it in time, so I only found the already-closed issue today.
Do you have WeChat or Zhihu, or any other instant-messaging app? I would like to quickly discuss your YOLOv4 and YOLOv5 tests with you!
Sorry for the disturbance, and thank you for your understanding.
