
Cross Stage Partial Networks

This is the implementation of "CSPNet: A New Backbone that can Enhance Learning Capability of CNN" using the Darknet framework.

To install the Darknet framework, refer to darknet (AlexeyAB).

Combined with CIoU loss, scale sensitivity, IoU threshold, greedy NMS, mosaic augmentation, and more, CSPResNeXt-50-PANet-SPP achieves impressive results on the MS COCO object detection test-dev set:

| Model | Size | FPS | AP | AP50 | AP75 | APS | APM | APL | cfg | weight |
|---|---|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP (SAM) | 512×512 | - | 42.7 | 64.6 | 46.3 | 23.7 | 46.1 | 55.3 | - | - |
| CSPResNeXt50-PANet-SPP (SAM) | 608×608 | - | 43.2 | 65.4 | 47.1 | 26.1 | 46.7 | 53.2 | - | - |
| CSPResNeXt50-PANet-SPP (GIoU) | 512×512 | - | 42.4 | 64.4 | 45.9 | 23.3 | 45.9 | 55.0 | - | - |
| CSPResNeXt50-PANet-SPP (GIoU) | 608×608 | - | 43.1 | 65.4 | 47.0 | 26.0 | 46.9 | 52.8 | - | - |
| CSPResNeXt50-PANet-SPP | 512×512 | 44 (1080 Ti) / 67 (GV100) | 42.4 | 64.4 | 45.9 | 23.2 | 45.5 | 55.3 | cfg | weight |
| CSPResNeXt50-PANet-SPP | 608×608 | 35 (1080 Ti) / 44 (GV100) | 43.2 | 65.4 | 47.0 | 25.7 | 46.7 | 53.3 | cfg | weight |
| CSPDarknet53-PANet-SPP | 512×512 | 51 (1080 Ti) | 42.4 | 64.5 | 46.0 | 23.9 | 45.6 | 54.2 | cfg | weight |

ImageNet

Big Models

| Model | #Parameters | BFLOPs | Top-1 | Top-5 | cfg | weight |
|---|---|---|---|---|---|---|
| DarkNet-53 [1] | 41.57M | 18.57 | 77.2 | 93.8 | cfg | weight |
| CSPDarkNet-53 | 27.61M (-34%) | 13.07 (-30%) | 77.2 (=) | 93.6 (-0.2) | cfg | weight |
| CSPDarkNet-53-Elastic | - | 7.74 (-58%) | 76.1 (-1.1) | 93.3 (-0.5) | cfg | weight |
| ResNet-50 [2] | 22.73M | 9.74 | 75.8 | 92.9 | cfg | weight |
| CSPResNet-50 | 21.57M (-5%) | 8.97 (-8%) | 76.6 (+0.8) | 93.3 (+0.4) | cfg | weight |
| CSPResNet-50-Elastic | - | 9.36 (-4%) | 76.8 (+1.0) | 93.5 (+0.6) | cfg | weight |
| ResNeXt-50 [3] | 22.19M | 10.11 | 77.8 | 94.2 | cfg | weight |
| CSPResNeXt-50 | 20.50M (-8%) | 7.93 (-22%) | 77.9 (+0.1) | 94.0 (-0.2) | cfg | weight |
| CSPResNeXt-50-Elastic | - | 5.45 (-46%) | 77.2 (-0.6) | 93.8 (-0.4) | cfg | weight |
| CSPResNeXt-50+Elastic | - | 7.82 (-23%) | 78.2 (+0.4) | 94.2 (=) | - | - |
| HarDNet-138s [4] | 35.5M | 13.4 | 77.8 | - | - | - |
| DenseNet-264-32 [5] | 27.21M | 11.03 | 77.8 | 93.9 | - | - |
| ResNet-152 [2] | 60.2M | 22.6 | 77.8 | 93.6 | - | - |
| DenseNet-201+Elastic [6] | 19.48M | 8.77 | 77.9 | 94.0 | - | - |
| CSPDenseNet-201+Elastic | 20.17M (+4%) | 7.13 (-19%) | 77.9 (=) | 94.0 (=) | - | - |
| Res2NetLite-72 [7] | - | 5.19 | 74.7 | 92.1 | cfg | weight |

Small Models

| Model | #Parameters | BFLOPs | Top-1 | Top-5 | cfg | weight |
|---|---|---|---|---|---|---|
| PeleeNet [8] | 2.79M | 1.017 | 70.7 | 90.0 | - | - |
| PeleeNet-swish | 2.79M | 1.017 | 71.5 | 90.7 | - | - |
| PeleeNet-swish-SE | 2.81M | 1.017 | 72.1 | 91.0 | - | - |
| CSPPeleeNet | 2.83M (+1%) | 0.888 (-13%) | 70.9 (+0.2) | 90.2 (+0.2) | - | - |
| CSPPeleeNet-swish | 2.83M (+1%) | 0.888 (-13%) | 71.7 (+0.2) | 90.8 (+0.1) | - | - |
| CSPPeleeNet-swish-SE | 2.85M (+1%) | 0.888 (-13%) | 72.4 (+0.3) | 91.0 (=) | - | - |
| SparsePeleeNet [9] | 2.39M | 0.904 | 69.6 | 89.3 | - | - |
| EfficientNet-B0* [10] | 4.81M | 0.915 | 71.3 | 90.4 | cfg | weight |
| EfficientNet-B0 (official) [10] | - | - | 70.0 | 88.9 | - | - |
| MobileNet-v2 [11] | 3.47M | 0.858 | 67.0 | 87.7 | cfg | weight |
| CSPMobileNet-v2 | 2.51M (-28%) | 0.764 (-11%) | 67.7 (+0.7) | 88.3 (+0.6) | cfg | weight |
| Darknet Ref. [12] | 7.31M | 0.96 | 61.1 | 83.0 | cfg | weight |
| CSPDenseNet Ref. | 3.48M (-52%) | 0.886 (-8%) | 65.7 (+4.6) | 86.6 (+3.6) | - | - |
| CSPPeleeNet Ref. | 4.10M (-44%) | 1.103 (+15%) | 68.9 (+7.8) | 88.7 (+5.7) | - | - |
| CSPDenseNetb Ref. | 1.38M (-81%) | 0.631 (-34%) | 64.2 (+3.1) | 85.5 (+2.5) | - | - |
| CSPPeleeNetb Ref. | 2.01M (-73%) | 0.897 (-7%) | 67.8 (+6.7) | 88.1 (+5.1) | - | - |
| ResNet-10 [2] | 5.24M | 2.273 | 63.5 | 85.0 | cfg | weight |
| CSPResNet-10 | 2.73M (-48%) | 1.905 (-16%) | 65.3 (+1.8) | 86.5 (+1.5) | - | - |
| MixNet-M-GPU | - | 1.065 | 71.5 | 90.5 | - | - |

※EfficientNet-B0* is implemented in the Darknet framework.

※EfficientNet-B0 (official) is trained with the official code and a batch size of 256.

※The Swish activation function is presented in [13].

※The squeeze-and-excitation (SE) module is presented in [14].

※MixNet-M-GPU is modified from MixNet-M [21].

Some tricks for improving Acc

1. Activation function

| Model | Activation | Top-1 | Top-5 |
|---|---|---|---|
| PeleeNet | LReLU | 70.7 | 90.0 |
| PeleeNet | Swish | 71.5 (+0.8) | 90.7 (+0.7) |
| PeleeNet | Mish | 71.4 (+0.7) | 90.4 (+0.4) |
| CSPPeleeNet | LReLU | 70.9 | 90.2 |
| CSPPeleeNet | Swish | 71.7 (+0.8) | 90.8 (+0.6) |
| CSPPeleeNet | Mish | 71.2 (+0.3) | 90.3 (+0.1) |
| CSPResNeXt-50 | LReLU | 77.9 | 94.0 |
| CSPResNeXt-50 | Mish | 78.9 (+1.0) | 94.5 (+0.5) |

※The Swish activation function is not suitable for ResNeXt-based models; details are shown in the Mish paper [22].
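For reference, a minimal sketch of the three activations compared above (written in PyTorch for readability; Darknet's C implementations may differ in constants such as the leaky slope):

```python
import torch
import torch.nn.functional as F

def leaky_relu(x):
    # Darknet's "leaky" activation uses a fixed negative slope of 0.1
    return F.leaky_relu(x, negative_slope=0.1)

def swish(x):
    # Swish [13]: x * sigmoid(x)
    return x * torch.sigmoid(x)

def mish(x):
    # Mish [22]: x * tanh(softplus(x))
    return x * torch.tanh(F.softplus(x))

x = torch.linspace(-3, 3, steps=7)
for fn in (leaky_relu, swish, mish):
    print(fn.__name__, fn(x))
```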

2. Data augmentation

| Model | Augmentation | Top-1 | Top-5 |
|---|---|---|---|
| CSPResNeXt-50 | Normal | 77.9 | 94.0 |
| CSPResNeXt-50 | Mixup | 77.2 | 94.0 |
| CSPResNeXt-50 | Cutmix | 78.0 | 94.3 |
| CSPResNeXt-50 | Cutmix+Mixup | 77.7 | 94.4 |
| CSPResNeXt-50 | Mosaic | 78.1 | 94.5 |
| CSPResNeXt-50 | Blur | 77.5 | 93.8 |

※Mixup is presented in [23] and used in [24].

※CutMix is presented in [25].

※Note: the mixup and CutMix implementations still have to be verified.
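Since the mixup implementation is still being checked, here is a minimal sketch of the idea as described in [23]; the alpha value and the soft-label mixing are assumptions, not this repository's exact code:

```python
import numpy as np

def mixup(images, one_hot_labels, alpha=0.2, rng=None):
    # Blend pairs of samples and their labels with a Beta-distributed weight [23]
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    idx = rng.permutation(len(images))
    mixed_images = lam * images + (1.0 - lam) * images[idx]
    mixed_labels = lam * one_hot_labels + (1.0 - lam) * one_hot_labels[idx]
    return mixed_images, mixed_labels

images = np.random.rand(8, 224, 224, 3).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[np.random.randint(0, 10, size=8)]
mixed_images, mixed_labels = mixup(images, labels)
print(mixed_images.shape, mixed_labels.shape)
```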

3. Other

| Model | Method | Top-1 | Top-5 |
|---|---|---|---|
| CSPResNeXt-50 | Normal | 77.9 | 94.0 |
| CSPResNeXt-50 | Smooth | 78.1 | 94.4 |

※"Smooth" means label smoothing, which is presented in [26].
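As a quick illustration of label smoothing [26] (a sketch, assuming eps = 0.1 and uniform redistribution over the K classes):

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    # Soften a one-hot target: the true class keeps 1 - eps + eps/K,
    # and every other class receives eps / K
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / k

y = np.eye(5, dtype=np.float32)[[2]]
print(smooth_labels(y))  # [[0.02 0.02 0.92 0.02 0.02]]
```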

MS COCO

GPU Real-time Models

| Model | Size | 1080 Ti FPS | AP | AP50 | AP75 | cfg | weight |
|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | 44 | 38.0 | 60.0 | 40.8 | cfg | weight |
| CSPDarknet53-PANet-SPP | 512×512 | 51 | 38.7 | 61.3 | 41.7 | cfg | weight |
| CSPResNet50-PANet-SPP | 512×512 | 55 | 38.0 | 60.5 | 40.7 | cfg | weight |

※PANet is presented in [15].

※SPP is presented in [16].

CPU Real-time Models

| Model | Size | 9900K FPS | AP | AP50 | AP75 | cfg | weight |
|---|---|---|---|---|---|---|---|
| YOLOv3-tiny [1] | 416×416 | 54 | - | 33.1 | - | cfg | weight |
| YOLOv3-tiny-PRN [18] | 416×416 | 71 | - | 33.1 | - | cfg | weight |
| SNet49-ThunderNet* [19] | 320×320 | 47 | 19.1 | 33.7 | 19.6 | - | - |
| Ours | 320×320 | 102 | 15.3 | 34.2 | 12.0 | - | - |
| SNet146-ThunderNet* [19] | 320×320 | 32 | 23.6 | 40.2 | 24.5 | - | - |
| Ours | 320×320 | 52 | 19.4 | 40.0 | 17.0 | - | - |
| Pelee** [7] | 304×304 | 7 | 22.4 | 38.3 | 22.9 | - | - |
| RefineDetLite** [20] | 320×320 | 8 | 26.8 | 46.6 | 27.4 | - | - |

※SNet49-ThunderNet* and SNet146-ThunderNet* are tested on a Xeon E5-2682 v4.

※Pelee** and RefineDetLite** are tested on an i7-6700.

Some tricks for improving AP

1. NMS threshold

| Model | Size | Threshold | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | 0.45 | 38.0 | 60.0 | 40.8 | 19.7 | 41.4 | 49.9 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.50 | 38.2 | 60.2 | 41.1 | 19.8 | 41.6 | 50.1 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.55 | 38.4 | 60.1 | 41.3 | 20.0 | 41.7 | 50.3 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.60 | 38.5 | 60.0 | 41.7 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.65 | 38.6 | 59.7 | 42.1 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | 0.70 | 38.5 | 59.2 | 42.4 | 20.1 | 41.9 | 50.4 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.45 | 39.4 | 59.4 | 42.5 | 20.4 | 42.6 | 51.4 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.50 | 39.7 | 59.5 | 42.7 | 20.5 | 42.5 | 51.7 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.55 | 39.8 | 59.5 | 43.0 | 20.7 | 43.1 | 51.9 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.60 | 40.0 | 59.3 | 43.4 | 20.8 | 43.2 | 52.0 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.65 | 40.1 | 59.0 | 43.8 | 20.9 | 43.4 | 52.1 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | 0.70 | 40.1 | 58.6 | 44.2 | 20.9 | 43.4 | 52.1 |
| CSPResNeXt50-PANet-SPP-GIoU | 512×512 | aware | 40.0 | 59.5 | 43.4 | 20.8 | 43.2 | 52.0 |

※GIoU is presented in [17].
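To make the "Threshold" column concrete, here is a minimal greedy-NMS sketch: a detection is suppressed when its IoU with an already-kept, higher-scoring box exceeds the threshold (plain NumPy, not the Darknet implementation):

```python
import numpy as np

def iou(box, boxes):
    # IoU of one (x1, y1, x2, y2) box against an array of boxes
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, thresh=0.45):
    # Keep the highest-scoring box, drop everything overlapping it by more
    # than `thresh`, then repeat on the survivors.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(greedy_nms(boxes, scores))  # [0, 2] with the default 0.45 threshold
```

Raising the threshold keeps more overlapping boxes, which the table above shows trades a little AP50 for better AP75.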

2. Activation function

| Model | Size | Activation | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPPeleeNet-PRN | 416×416 | Leaky ReLU | 23.1 | 44.5 | 22.0 | 6.6 | 24.4 | 35.3 |
| CSPPeleeNet-PRN | 416×416 | Swish | 24.1 | 45.8 | 23.3 | 6.8 | 26.1 | 35.5 |

3. Loss function

| Model | Size | Loss | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | MSE | 38.0 | 60.0 | 40.8 | 19.7 | 41.4 | 49.9 |
| CSPResNeXt50-PANet-SPP | 512×512 | GIoU | 39.4 | 59.4 | 42.5 | 20.4 | 42.6 | 51.4 |
| CSPResNeXt50-PANet-SPP | 512×512 | DIoU | 39.1 | 58.8 | 42.1 | 20.1 | 42.4 | 50.7 |
| CSPResNeXt50-PANet-SPP | 512×512 | CIoU | 39.6 | 59.2 | 42.6 | 20.5 | 42.9 | 51.6 |

※DIoU and CIoU are presented in [27].
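For illustration, a minimal sketch of the GIoU loss [17] on axis-aligned boxes (plain Python; the in-tree Darknet loss adds normalizers and handles degenerate boxes):

```python
def giou_loss(a, b):
    # Boxes as (x1, y1, x2, y2). GIoU = IoU - (C - union) / C, where C is the
    # area of the smallest box enclosing both; the loss is 1 - GIoU [17].
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    cw = max(a[2], b[2]) - min(a[0], b[0])   # enclosing box width
    ch = max(a[3], b[3]) - min(a[1], b[1])   # enclosing box height
    enclose = cw * ch
    return 1.0 - (iou - (enclose - union) / enclose)

print(giou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # ~1.079
```

DIoU and CIoU [27] extend this by penalizing center-point distance (and, for CIoU, aspect-ratio inconsistency) instead of the enclosing-box term.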

Citation

```
@inproceedings{wang2020cspnet,
  title={CSPNet: A new backbone that can enhance learning capability of cnn},
  author={Wang, Chien-Yao and Mark Liao, Hong-Yuan and Wu, Yueh-Hua and Chen, Ping-Yang and Hsieh, Jun-Wei and Yeh, I-Hau},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  pages={390--391},
  year={2020}
}
```

Reference

[1] YOLOv3: An Incremental Improvement

[2] Deep Residual Learning for Image Recognition (CVPR 2016)

[3] Aggregated Residual Transformations for Deep Neural Networks (CVPR 2017)

[4] HarDNet: A Low Memory Traffic Network (ICCV 2019)

[5] Densely Connected Convolutional Networks (CVPR 2017)

[6] ELASTIC: Improving CNNs with Dynamic Scaling Policies (CVPR 2019)

[7] RefineDetLite: A Lightweight One-stage Object Detection Framework for CPU-only Devices

[8] Pelee: A Real-Time Object Detection System on Mobile Devices (NeurIPS 2018)

[9] Sparsely Aggregated Convolutional Networks (ECCV 2018)

[10] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)

[11] MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)

[12] https://pjreddie.com/darknet/tiny-darknet/

[13] Searching for Activation Functions

[14] Squeeze-and-Excitation Networks (CVPR 2018)

[15] Path Aggregation Network for Instance Segmentation (CVPR 2018)

[16] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition (TPAMI 2015)

[17] Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression (CVPR 2019)

[18] Enriching Variety of Layer-wise Learning Information by Gradient Combination (ICCVW 2019)

[19] ThunderNet: Towards Real-time Generic Object Detection (ICCV 2019)

[20] RefineDetLite: A Lightweight One-stage Object Detection Framework for CPU-only Devices

[21] MixConv: Mixed Depthwise Convolutional Kernels

[22] Mish: A Self Regularized Non-Monotonic Neural Activation Function

[23] mixup: Beyond Empirical Risk Minimization (ICLR 2018)

[24] Bag of Freebies for Training Object Detection Neural Networks

[25] CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (ICCV 2019)

[26] Rethinking the Inception Architecture for Computer Vision (CVPR 2016)

[27] Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression (AAAI 2020)

Acknowledgements

https://github.com/AlexeyAB/darknet

https://github.com/ultralytics/yolov3


Issues

Did you compare AP and FPS (rather than BFLOPs) of models?

@WongKinYiu Hi,

Did you compare AP (MS COCO) and 1080 Ti FPS (rather than BFLOPs) of the models?

With the same network resolution, mini_batch = batch/subdivisions, data augmentation, and dataset, to compare apples with apples.

The problem of using label_smooth_eps=0.1 for classification

I used the csdarknet53-omega.cfg file to train a classification network (not detection) on my own dataset. With label_smooth_eps=0.1, the loss increases gradually no matter what learning rate I set; without it, the loss gradually converges. Why is that?
By the way, I have five categories with an average of 1,800 pictures per category. How many training iterations and what batch size should I set?

Shortcut error in csresnext50-panet-spp-original-optimal.cfg

In the cfg file, the first [shortcut] layer takes from=-4, which points to a convolutional layer with filters=64, while the convolutional layer immediately before the [shortcut] has filters=128. They cannot be added; is this an error?
```
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=8
width=512
height=512
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.00261
burn_in=1000
max_batches = 500500
policy=steps
steps=400000,450000
scales=.1,.1

#cutmix=1
mosaic=1

#19:104x104 38:52x52 65:26x26 80:13x13 for 416

[convolutional]
batch_normalize=1
filters=64
size=7
stride=2
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

# 1-1

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
groups=32
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky
```

About the ImageNet classification training results of cspdarknet53

Hello,
I looked at your csdarknet53.cfg and found the following settings:
angle=7
hue=.1
saturation=.75
exposure=.75
aspect=.75
Were these augmentations also enabled when training on the ImageNet dataset, i.e. is that how the 77% top-1 accuracy was obtained?

About the weight_decay setting in the cspdarknet53 cfg

Hello, I see decay=0.0005 in your cfg/csdarknet53.cfg, but your CSPNet paper says weight_decay=0.005, and the YOLOv4 paper also says weight_decay=0.005. In practice, does this difference matter much when training the cspdarknet53 classifier?

Bad inference performance with CSPResNeXt50-PANet-SPP

Hi,

I've been inspecting CSPResNeXt50-PANet-SPP for real-time human detection. According to the readme of this repository, CSPResNeXt50-PANet-SPP outperforms YOLOv3 in AP on the COCO dataset.

To verify this, I downloaded the cfg and weights of CSPResNeXt50-PANet-SPP and compared it with YOLOv3 (yolov3.cfg + yolov3.weights, the result of COCO training).

As far as I could observe, CSPResNeXt50-PANet-SPP is not better than YOLOv3, at least for my case of detecting humans in video streams. Here is an example of the results of both networks:

  1. Inference result with CSPResNeXt50-PANet-SPP:
     [image: CSPResNeXt50-PANet-SPP detection]

  2. Inference result with YOLOv3:
     [image: yolov3 detection]

My question is whether these images represent a special case where CSPResNeXt50-PANet-SPP may perform worse than YOLOv3, for instance for small objects like the humans in these images? Or what is the best way to explain this?

Thanks in advance.

CSPMobileNet in PyTorch

I am trying to combine CSP with MobileNet in PyTorch, but it is hard to read the Darknet config of that model. Can you visualize this model in a tool like Netron?

FLOPs mismatch across different frameworks.

Hi @WongKinYiu

Thanks for your answer to my previous question. However, I found a severe FLOPs mismatch between your paper and the Darknet numbers @AlexeyAB.

For example, for ResNet-50, the original paper states that the FLOPs are around 3.8 to 4.0 GFLOPs. However, in your paper it is 9.74 BFLOPs, a big difference. There seems to be a ratio of about 2.59 between them, right?
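One plausible explanation (my assumption, not an official statement from either repo): Darknet's BFLOPs count a multiply-add as two operations, and the classifier cfgs here run at 256×256, while the ResNet paper reports multiply-accumulates at 224×224:

```python
# Expected ratio if Darknet counts 2 ops per MAC and profiles at 256x256
expected = 2 * (256 / 224) ** 2
observed = 9.74 / 3.8          # Darknet BFLOPs vs. paper GMACs for ResNet-50
print(round(expected, 2), round(observed, 2))  # ~2.61 vs. ~2.56
```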

Gradient calculation in paper

Hi,
I am interested in CSPNet recently, and reading the paper: https://arxiv.org/pdf/1911.11929.pdf.
But I have a question about the gradient calculation in page 4, in the paper the gradient calculate as

w1' = f(w1, g0)
w2' = f(w2, g0, g1)
...
wk' = f(wk, g0, g1, g2, ..., gk-1)

Don't this part is calculated as this?

w1' = f(w1, g0, g1, g2, ..., gk)
w2' = f(w2, g1, g2, ..., gk)
...
wk' = f(wk, gk)

also I want to confirm that if the definition of gi is the partial differential of error to weight? that is,

I was very confuse about this part, hope that you can help me.

How to split the feature map in two

In your paper, I see that the first feature map is split in two: one part goes to the concat, and the other goes through the conv path and is copied forward. But I didn't find any corresponding operation in your training cfg file, csresnext50-panet-spp-original-optimal.cfg.
@WongKinYiu @AlexeyAB
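For anyone else puzzled by this: my reading of csresnext50-panet-spp-original-optimal.cfg (an interpretation, not the authors' statement) is that the "split" is realized by two 1×1 convolutions fed from the same input via [route] layers=-2, rather than by literally slicing channels. A minimal PyTorch sketch of that structure:

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Sketch of a CSP-style stage: two 1x1 "split" convs from the same input,
    a processed path, a concat, and a 1x1 transition (cf. the cfg's [route])."""
    def __init__(self, c_in, c_mid, blocks):
        super().__init__()
        self.part1 = nn.Conv2d(c_in, c_mid, 1)           # bypass path
        self.part2 = nn.Conv2d(c_in, c_mid, 1)           # dense/residual path
        self.blocks = blocks                              # any nn.Module stack
        self.transition = nn.Conv2d(2 * c_mid, c_in, 1)   # fuse after concat

    def forward(self, x):
        y1 = self.part1(x)
        y2 = self.blocks(self.part2(x))
        return self.transition(torch.cat([y1, y2], dim=1))

blk = CSPBlock(64, 64, nn.Identity())
print(blk(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```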

Question about training time

I read in the paper that you train the model on a single GPU. How long does it take to train on a single GPU? Why not train on multiple GPUs?

Question about inference time with batch size 1 or larger

I use CSPDarknet as the backbone of my detector in mmdetection, but the inference time of CSPDarknet is slower than Darknet when the batch size is 1. However, when the batch size increases to 8 or more, the inference time is faster than Darknet.
Is this normal, or is something wrong?

Thanks!

Problems understanding the concept of "csdarknet53s-panet-spp"

@WongKinYiu

Hello! Thanks for your great work! I want to use "csdarknet53s-panet-spp" and "yolov3-spp" in my master's thesis. I understand yolov3-spp, but I don't quite understand csdarknet53s-panet-spp.

  1. First of all, what's the difference between "csdarknet53s" and "csdarknet53m"?
  2. The detector is called "panet", yet the cfg file also has 3 YOLO layers. So what's the difference between PANet and YOLO? Is it just that the augmented bottom-up pathway was added to the FPN architecture, that the adaptive feature pooling layer was added after this extension, and that the backbone in general (CSPDarknet) is different because it uses partial dense blocks?
  3. Also, the main difference between vanilla Darknet and CSPDarknet is not quite clear to me. Is it just that we use several partial dense blocks, in which we split the base feature map, concatenate one part with several feature maps and feed the concatenation into the transition layer, while the other part of the base feature map is then combined with the transitioned concatenation?

Also, for training csdarknet53s-panet-spp we need "csdarknet53.conv.104", as you wrote in issue #17. However, I can't find the file. Could you please link it?

Different results between the repo and the preprint paper

In the preprint paper, I saw different results on the MS COCO object detection task. For instance, at size 608×608, CSPResNeXt50-PANet-SPP has an AP of 43.2 in the repo compared with 38.4 in the paper. That is a big gap; could you explain it?
[screenshot from ReadMe.md]
[screenshot from the preprint paper]

One more question: the input size of csresnext50-panet-spp-original-optimal.cfg is 512, so the last YOLO output grids are, in order, 64, 32, and 16. When you run the optimal model at input size 608 they become 76, 38, and 19. Do you need to modify the model to get the best result?
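On the second question: the YOLO grid sizes follow directly from the three head strides (8, 16, 32), so they can be recomputed for any input size divisible by 32, as this small check shows:

```python
# Grid size per detection head = input size / stride
for size in (512, 608):
    print(size, [size // stride for stride in (8, 16, 32)])
# 512 [64, 32, 16]
# 608 [76, 38, 19]
```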

yolov3-spp-matrix.cfg in ultralytics/yolov3 - missing 'share_index' field implementation

Hi @WongKinYiu, thanks for the great work here! I introduced new error checking in https://github.com/ultralytics/yolov3 for custom cfgs, and after running your yolov3-spp-matrix.cfg through it, I realized my implementation does not handle your 'share_index' field correctly. What exactly do I do with the value share_index=115 here?

```
AssertionError: Unsupported field 'share_index' in cfg/yolov3-spp-matrix.cfg.
```

```
[convolutional]
share_index=115
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
```

Training never starts?

I used csresnext50-panet-spp-original-optimal.cfg and the weight file associated with it. I followed the steps in @AlexeyAB's YOLO repo and changed my cfg accordingly: I set all convolutional filters before the [yolo] layers to 18, as I have 1 class, reduced max_batches to 10000, and set the steps to 8000,9000.
But the training never starts; it loads the model and just stops.
How do I fix this?

Need help setting hyper-parameters

@WongKinYiu I have been trying to set the right hyper-parameters for yolov3-spp on the complete Open Images dataset, but after 300-400 iterations the server restarts. I previously trained on 3 of the 601 classes with a small dataset of about 1,100 images, and used single-GPU parameters even for multi-GPU training. Now, training on the whole dataset with multi-GPU parameters causes a system reboot.
By the way, how do you calculate the parameters for multi-GPU training? You already replied to me in previous issues on @AlexeyAB's repo about how to set burn_in, learning rate, and decay in the cfg file, but that setup is causing the issue, so I changed almost all parameters back to the single-GPU configuration except burn_in, and the problem persists.
[screenshots of the hardware and training output]
For the above hardware, here is the link to the config I'm using.
Please help me out.
Thanks

Can csmobilenetv2 use TensorRT to accelerate inference?

I have trained my own model with csmobilenetv2 as the backbone and yolov3_tiny as the head, and got the final best weights. Now I want to accelerate inference with TensorRT version 5.1.6.1, but I get errors when translating the weights to an ONNX model. Using yolov3.weights/yolov3.cfg and yolov3-tiny.weights/yolov3-tiny.cfg works fine.

tensorrt:5.1.6.1
python:3.6.9
onnx:1.4.1
numpy:1.18.4

```
Traceback (most recent call last):
  File "yolov3_to_onnx.py", line 840, in <module>
    main()
  File "yolov3_to_onnx.py", line 827, in main
    verbose=True)
  File "yolov3_to_onnx.py", line 447, in build_onnx_graph
    params)
  File "yolov3_to_onnx.py", line 322, in load_conv_weights
    conv_params, 'conv', 'weights')
  File "yolov3_to_onnx.py", line 351, in _create_param_tensors
    conv_params, param_category, suffix)
  File "yolov3_to_onnx.py", line 383, in _load_one_param_type
    buffer=self.weights_file.read(param_size * 4))
TypeError: buffer is too small for requested array
```

How to improve the model for custom data

@WongKinYiu Hi,

I have a problem: sometimes some pictures are not detected, or are detected incorrectly. I attached my model and some test images. Could you please check them and guide me? I have about 2K images per class. Please give me some information about the hyper-parameters for my case.

[attachment: file]

Thanks in advance

Originally posted by @zpmmehrdad in #6 (comment)

[Question] How to run inference with a classifier model

Hi @WongKinYiu, I want to ask a question. According to the pjreddie website, an image classification model can be trained and inferenced using the Darknet framework: https://pjreddie.com/darknet/imagenet/. But is there any way to run inference with the classifier model without using the Darknet framework? For example, an object detection model can be inferenced using OpenCV DNN or by converting it to another framework. Can you kindly give me advice? Thank you so much in advance.
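One option that avoids the Darknet binary is OpenCV's DNN module, which can parse Darknet cfg/weights pairs. A hedged sketch (the file names are placeholders, and not every layer type in every cfg is guaranteed to be supported by OpenCV's importer):

```python
import cv2
import numpy as np

# Load a Darknet classifier; cfg/weights paths here are placeholders.
net = cv2.dnn.readNetFromDarknet("csdarknet53.cfg", "csdarknet53.weights")

img = cv2.imread("test.jpg")
# Scale pixels to [0, 1], resize to the cfg's input size, and swap BGR->RGB.
blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0, size=(256, 256),
                             swapRB=True, crop=False)
net.setInput(blob)
probs = net.forward().flatten()
print(int(np.argmax(probs)), float(probs.max()))  # top-1 class id and score
```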

Mosaic Augmentation Paper?

@WongKinYiu @AlexeyAB great work on the README; there's a wealth of information here! I saw that mosaic augmentation outperformed all the rest in your README tests, including better-known ones like CutMix.

Based on this, I wonder if it's worth publishing a short paper on arXiv; we could then link to this GitHub repo, alexeyab/darknet, and ultralytics/yolov3 for the mosaic dataloader code.

HarDNet

Do you have a Darknet implementation of HarDNet?

How many layers to extract from a .cfg ?

I want to retrain csmobilenet-v2.cfg for a different task, starting from the provided weights.

How many layers should I extract from the provided weights?

There are 59 [convolutional] layers; since other layers like max-pooling and routes don't have weights, and since I am changing the number of convolutional filters immediately before the softmax, I guess the answer is 58?

Like:

```
./darknet partial csmobilenet-v2.cfg csmobilenet-v2.weights csmobilenet-v2.conv.58 58
```

What does CIO mean?

What does CIO mean? I can't figure it out; please tell me the full name of CIO!

The figure comparing YOLOv4 and YOLOv5

Hello, I found this figure in an issue; it compares the detection results of different YOLO versions.
[figure: AP vs. average latency for different YOLO models]
There are several points in the results for each YOLO; I would like to know what they mean. Why does AP increase as average latency increases?

Try to train fast (grouped-conv) versions of csdarknet53 and csdarknet19

@WongKinYiu Hi,

Since CSPDarkNet53 is better than CSPResNeXt50 for the detector, try to train these 4 models:

| Model | GPU | 256×256 (FPS) | 512×512 (FPS) | 608×608 (FPS) |
|---|---|---|---|---|
| darknet53.cfg (original) | RTX 2070 | 113 | 56 | 38 |
| csdarknet53.cfg (original) | RTX 2070 | 101 | 57 | 41 |
| csdarknet53g.cfg.txt | RTX 2070 | 122 | 64 | 46 |
| csdarknet53ghr.cfg.txt | RTX 2070 | 100 | 75 | 57 |
| spinenet49.cfg.txt (low priority) | RTX 2070 | 49 | 44 | 43 |
| csdarknet19-fast.cfg.txt | RTX 2070 | 213 | 149 | 116 |

csdarknet19-fast.cfg contains DropBlock, so use the latest version of Darknet, which uses fast random functions for DropBlock.

How to use a cfg?

Hi,

I've been using Darknet (AlexeyAB) for a while and I am familiar with configuring a cfg for custom object-detection training. This repository was suggested to me, and the claimed results seem promising.

But I could not find how to use a cfg for training. For instance, I could not figure out how to adapt the cfg to the number of classes I'd like to train for. In regular Darknet, one has to configure some parameters in the YOLO layers (number of filters, classes, etc.) in order to adapt the cfg to a custom training.

Any help is appreciated.
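For the class-count question specifically, the usual AlexeyAB-Darknet convention applies to these cfgs as well (stated as my understanding, not this repo's documentation): set classes= in each [yolo] layer, and set filters= in the [convolutional] layer immediately before it to (classes + 5) × 3:

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    # Each anchor predicts 4 box coordinates + 1 objectness + num_classes scores
    return (num_classes + 5) * anchors_per_scale

print(yolo_filters(1))   # 18, as in the "Training never starts?" issue above
print(yolo_filters(80))  # 255 for COCO
```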

Training steps mismatch between the paper and the code in the ImageNet experiments

Hi,

In the ImageNet experiments, the paper says the models are trained for 800 epochs:

[screenshot from the paper]

However, the code says they are trained for 80 epochs:

[screenshot from the cfg]

So there is a big difference.

Besides, I tried to re-implement it in PyTorch, and the accuracy is 7-8 points behind your method, even though the network architecture and the number of parameters are the same as in your Darknet results.

Best,
Mu

Does the CSP model need more training time?

Due to the limitation of my GPU devices, I only tested the model with epoch = 1, and found that, compared with the traditional ResNeXt model, the result of the CSPResNeXt model after one epoch is not satisfactory. Is it because of the residual links that the model needs more time to learn?

Could you provide the cfg of CSPPeleeNet ?

I am interested in CSPPeleeNet. Since you haven't provided its cfg yet, I planned to make it myself. But according to your cfg files for ResNet-50 and CSPResNet-50, going from ResNet-50 to CSPResNet-50 you not only added the CSP connections but also changed other parts of the network structure, such as the number of blocks ([3 4 6 3] vs. [3 3 5 2]) and the number of filters. So could you provide the cfg of CSPPeleeNet? Thanks.

How to edit a cfg file

Hello,
I am trying to build a new small and efficient model by combining csmobilenetv2 (backbone) with yolov3_tiny (head) to detect objects. Everything goes well until about 1000 iterations (avg loss ~ 8.0), but the loss becomes NaN after that.
Does anyone have any ideas? Thank you.
Here is my layer information:
```
batch = 1, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 conv 32 3 x 3/ 2 224 x 224 x 3 -> 112 x 112 x 32 0.022 BF
1 conv 16/ 16 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.007 BF
2 route 0 -> 112 x 112 x 32
3 conv 32 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.026 BF
4 conv 32/ 32 3 x 3/ 1 112 x 112 x 32 -> 112 x 112 x 32 0.007 BF
5 conv 16 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 16 0.013 BF
6 route 5 1 -> 112 x 112 x 32
7 conv 48 1 x 1/ 1 112 x 112 x 32 -> 112 x 112 x 48 0.039 BF
8 conv 48/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 48 0.003 BF
9 route 7 -> 112 x 112 x 48
10 conv 96/ 48 3 x 3/ 2 112 x 112 x 48 -> 56 x 56 x 96 0.005 BF
11 conv 24 1 x 1/ 1 56 x 56 x 96 -> 56 x 56 x 24 0.014 BF
12 conv 72 1 x 1/ 1 56 x 56 x 24 -> 56 x 56 x 72 0.011 BF
13 conv 144/ 72 3 x 3/ 1 56 x 56 x 72 -> 56 x 56 x 144 0.008 BF
14 conv 24 1 x 1/ 1 56 x 56 x 144 -> 56 x 56 x 24 0.022 BF
15 Shortcut Layer: 11, wt = 0, wn = 0, outputs: 56 x 56 x 24 0.000 BF
16 route 15 8 -> 56 x 56 x 72
17 conv 72 1 x 1/ 1 56 x 56 x 72 -> 56 x 56 x 72 0.033 BF
18 conv 72/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 72 0.001 BF
19 route 17 -> 56 x 56 x 72
20 conv 144/ 72 3 x 3/ 2 56 x 56 x 72 -> 28 x 28 x 144 0.002 BF
21 conv 32 1 x 1/ 1 28 x 28 x 144 -> 28 x 28 x 32 0.007 BF
22 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
23 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
24 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
25 Shortcut Layer: 21, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
26 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
27 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
28 conv 32 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 32 0.010 BF
29 Shortcut Layer: 25, wt = 0, wn = 0, outputs: 28 x 28 x 32 0.000 BF
30 conv 96 1 x 1/ 1 28 x 28 x 32 -> 28 x 28 x 96 0.005 BF
31 conv 192/ 96 3 x 3/ 1 28 x 28 x 96 -> 28 x 28 x 192 0.003 BF
32 conv 64 1 x 1/ 1 28 x 28 x 192 -> 28 x 28 x 64 0.019 BF
33 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
34 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
35 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
36 Shortcut Layer: 32, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
37 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
38 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
39 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
40 Shortcut Layer: 36, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
41 conv 192 1 x 1/ 1 28 x 28 x 64 -> 28 x 28 x 192 0.019 BF
42 conv 384/ 192 3 x 3/ 1 28 x 28 x 192 -> 28 x 28 x 384 0.005 BF
43 conv 64 1 x 1/ 1 28 x 28 x 384 -> 28 x 28 x 64 0.039 BF
44 Shortcut Layer: 40, wt = 0, wn = 0, outputs: 28 x 28 x 64 0.000 BF
45 route 44 18 -> 28 x 28 x 136
46 conv 192 1 x 1/ 1 28 x 28 x 136 -> 28 x 28 x 192 0.041 BF
47 conv 192/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 192 0.001 BF
48 route 46 -> 28 x 28 x 192
49 conv 384/ 192 3 x 3/ 2 28 x 28 x 192 -> 14 x 14 x 384 0.001 BF
50 conv 96 1 x 1/ 1 14 x 14 x 384 -> 14 x 14 x 96 0.014 BF
51 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
52 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
53 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
54 Shortcut Layer: 50, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
55 conv 288 1 x 1/ 1 14 x 14 x 96 -> 14 x 14 x 288 0.011 BF
56 conv 576/ 288 3 x 3/ 1 14 x 14 x 288 -> 14 x 14 x 576 0.002 BF
57 conv 96 1 x 1/ 1 14 x 14 x 576 -> 14 x 14 x 96 0.022 BF
58 Shortcut Layer: 54, wt = 0, wn = 0, outputs: 14 x 14 x 96 0.000 BF
59 route 58 47 -> 14 x 14 x 288
60 conv 288 1 x 1/ 1 14 x 14 x 288 -> 14 x 14 x 288 0.033 BF
61 conv 288/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 288 0.000 BF
62 route 60 -> 14 x 14 x 288
63 conv 576/ 288 3 x 3/ 2 14 x 14 x 288 -> 7 x 7 x 576 0.001 BF
64 conv 160 1 x 1/ 1 7 x 7 x 576 -> 7 x 7 x 160 0.009 BF
65 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
66 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
67 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
68 Shortcut Layer: 64, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
69 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
70 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
71 conv 160 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 160 0.015 BF
72 Shortcut Layer: 68, wt = 0, wn = 0, outputs: 7 x 7 x 160 0.000 BF
73 conv 480 1 x 1/ 1 7 x 7 x 160 -> 7 x 7 x 480 0.008 BF
74 conv 960/ 480 3 x 3/ 1 7 x 7 x 480 -> 7 x 7 x 960 0.001 BF
75 conv 320 1 x 1/ 1 7 x 7 x 960 -> 7 x 7 x 320 0.030 BF
76 route 75 61 -> 7 x 7 x 608
77 conv 640 1 x 1/ 1 7 x 7 x 608 -> 7 x 7 x 640 0.038 BF
78 conv 27 1 x 1/ 1 7 x 7 x 640 -> 7 x 7 x 27 0.002 BF
79 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
80 route 75 -> 7 x 7 x 320
81 conv 128 1 x 1/ 1 7 x 7 x 320 -> 7 x 7 x 128 0.004 BF
82 upsample 2x 7 x 7 x 128 -> 14 x 14 x 128
83 route 82 60 -> 14 x 14 x 416
84 conv 256 3 x 3/ 1 14 x 14 x 416 -> 14 x 14 x 256 0.376 BF
85 conv 27 1 x 1/ 1 14 x 14 x 256 -> 14 x 14 x 27 0.003 BF
  86 yolo
```

Performance comparison between YOLOv4 and YOLOv5

Hello Chien-Yao Wang! I am Amusi. I would like to learn about your benchmark results comparing YOLOv4 and YOLOv5, because today I just saw the issue you submitted in your CSPNet open-source project, in which you also mentioned me (#32).

I did not see it in time, so I only found the already-closed issue today.
Do you have WeChat or Zhihu, or any other instant-messaging app? I would like to quickly discuss your YOLOv4 and YOLOv5 tests with you!
Sorry for the disturbance, and thank you for your understanding.
