vdigpku / dynamicdet

[CVPR 2023] DynamicDet: A Unified Dynamic Architecture for Object Detection

coco dynamic-neural-network object-detection yolo

dynamicdet's Introduction

DynamicDet [arXiv]

This repo contains the official implementation of "DynamicDet: A Unified Dynamic Architecture for Object Detection".

Performance

MS COCO

Model	Easy / Hard	Input size	FLOPs	FPS	AP^val	AP^test
Dy-YOLOv7	90% / 10%	640	112.4G	110	51.4%	52.1%
Dy-YOLOv7	50% / 50%	640	143.2G	96	52.7%	53.3%
Dy-YOLOv7	10% / 90%	640	174.0G	85	53.3%	53.8%
Dy-YOLOv7	0% / 100%	640	181.7G	83	53.5%	53.9%

Dy-YOLOv7-X	90% / 10%	640	201.7G	98	53.0%	53.3%
Dy-YOLOv7-X	50% / 50%	640	248.9G	78	54.2%	54.4%
Dy-YOLOv7-X	10% / 90%	640	296.1G	65	54.7%	55.0%
Dy-YOLOv7-X	0% / 100%	640	307.9G	64	54.8%	55.0%

Dy-YOLOv7-W6	90% / 10%	1280	384.2G	74	54.9%	55.2%
Dy-YOLOv7-W6	50% / 50%	1280	480.8G	58	55.9%	56.1%
Dy-YOLOv7-W6	10% / 90%	1280	577.4G	48	56.4%	56.7%
Dy-YOLOv7-W6	0% / 100%	1280	601.6G	46	56.5%	56.8%

Table Notes

FPS is measured on the same machine with one NVIDIA V100 GPU, using batch_size = 1, no_trace and fp16.
More results can be found in the paper.
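
The Dy-YOLOv7 FLOPs values in the table look consistent with a linear mix of the easy and hard paths, weighted by the share of images sent to each. The sketch below checks this on the table's own numbers. The linear relationship is an assumption inferred from the table, not something the authors state here, and `interpolate_flops` is a hypothetical helper name.

```python
def interpolate_flops(hard_frac, point_a, point_b):
    """Linearly interpolate FLOPs (in G) between two (hard_fraction, flops) points.

    Assumption: average FLOPs grow linearly with the fraction of images
    routed to the hard path. This is inferred from the table, not documented.
    """
    (frac_a, flops_a), (frac_b, flops_b) = point_a, point_b
    slope = (flops_b - flops_a) / (frac_b - frac_a)
    return flops_a + slope * (hard_frac - frac_a)


# Two Dy-YOLOv7 rows from the table, written as (hard fraction, FLOPs in G).
row_90_hard = (0.9, 174.0)   # 10% / 90%
row_100_hard = (1.0, 181.7)  # 0% / 100%

for hard_frac in (0.1, 0.5):
    estimate = interpolate_flops(hard_frac, row_90_hard, row_100_hard)
    print(f"hard fraction {hard_frac:.0%}: about {estimate:.1f}G FLOPs")
```

The two estimates land close to the 112.4G and 143.2G reported for the 90% / 10% and 50% / 50% rows. The Dy-YOLOv7-W6 rows follow the same trend only approximately, which may be due to rounding in the table.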

Quick Start

Installation

cd DynamicDet
conda install pytorch=1.11 cudatoolkit=11.3 torchvision -c pytorch
pip install -r requirements.txt

Data preparation

Download MS COCO dataset images (train, val, test) and labels.

├── coco
│   ├── train2017.txt
│   ├── val2017.txt
│   ├── test-dev2017.txt
│   ├── images
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── labels
│   │   ├── train2017
│   │   ├── val2017
│   ├── annotations
│   │   ├── instances_val2017.json
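
Before starting a long training run, it can help to confirm that the dataset directory matches the tree above. The following is a minimal sketch using only the standard library. The helper name `missing_coco_entries` and the root path `coco` are assumptions for illustration; the actual root depends on what data/coco.yaml points to.

```python
from pathlib import Path

# Entries copied from the directory tree above, relative to the dataset root.
EXPECTED = [
    "train2017.txt",
    "val2017.txt",
    "test-dev2017.txt",
    "images/train2017",
    "images/val2017",
    "images/test2017",
    "labels/train2017",
    "labels/val2017",
    "annotations/instances_val2017.json",
]


def missing_coco_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    # "coco" is an assumed location; adjust it to your own dataset root.
    missing = missing_coco_entries("coco")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout matches the expected tree.")
```

The check only looks at whether the files and directories exist; it does not validate their contents.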

Training

Step1: Training cascaded detector

Single GPU training

python train_step1.py --workers 8 --device 0 --batch-size 16 --epochs 300 --img 640 --cfg cfg/dy-yolov7-step1.yaml --weight '' --data data/coco.yaml --hyp hyp/hyp.scratch.p5.yaml --name dy-yolov7-step1

Multiple GPU training (OURS, RECOMMENDED 🚀)

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_step1.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --epochs 300 --img 640 --cfg cfg/dy-yolov7-step1.yaml --weight '' --data data/coco.yaml --hyp hyp/hyp.scratch.p5.yaml --name dy-yolov7-step1

Step2: Training adaptive router

python train_step2.py --workers 4 --device 0 --batch-size 1 --epochs 2 --img 640 --adam --cfg cfg/dy-yolov7-step2.yaml --weight runs/train/dy-yolov7-step1/weights/last.pt --data data/coco.yaml --hyp hyp/hyp.finetune.dynamic.adam.yaml --name dy-yolov7-step2

Getting the dynamic thresholds for variable-speed inference

python get_dynamic_thres.py --device 0 --batch-size 1 --img-size 640 --cfg cfg/dy-yolov7-step2.yaml --weight weights/dy-yolov7.pt --data data/coco.yaml --task val

Testing

python test.py --img-size 640 --batch-size 1 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/dy-yolov7-step2.yaml --weight weights/dy-yolov7.pt --data data/coco.yaml --dy-thres <DY_THRESHOLD>

Inference

python detect.py --cfg cfg/dy-yolov7-step2.yaml --weight weights/dy-yolov7.pt --num-classes 80 --source <IMAGE/VIDEO> --device 0 --dy-thres <DY_THRESHOLD>

Citation

If you find this repo useful in your research, please consider citing the following paper:

@inproceedings{lin2023dynamicdet,
  title={DynamicDet: A Unified Dynamic Architecture for Object Detection},
  author={Lin, Zhihao and Wang, Yongtao and Zhang, Jinhe and Chu, Xiaojie},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}

License

This project is free for academic research purposes only; commercial use requires authorization. For commercial licensing, please contact [email protected].

dynamicdet's Issues

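About the config files for Dy-Faster R-CNN ResNet50 and Dy-Mask R-CNN Swin-T

Hello, where are the config files for Dy-Faster R-CNN ResNet50 and Dy-Mask R-CNN Swin-T? I could not find them in the code.

How long does one epoch take you to train? With four A6000 GPUs, one epoch on COCO takes me a full day!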

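Question about variable-speed inference

Hello,

Equation (14) gives the proportion of images classified as "hard". Suppose it is 0.75, meaning 75% of the images are classified as "hard". Shouldn't the threshold then be the 25th percentile, so that 75% of the images have a difficulty score above it? Passing 0.75 directly to the percentile function does not seem right.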
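
The relationship described in this question can be checked on toy data. The sketch below assumes that a higher score means a harder image and that an image takes the hard path when its score is strictly above the threshold; the repository's own convention may differ, and `threshold_for_hard_fraction` is a hypothetical helper, not code from get_dynamic_thres.py.

```python
import random


def threshold_for_hard_fraction(scores, hard_frac):
    """Return a threshold with about `hard_frac` of the scores strictly above it.

    Assumption: higher score = harder image, routed to the hard path
    when score > threshold.
    """
    ordered = sorted(scores)
    n = len(ordered)
    n_hard = round(hard_frac * n)  # number of images meant for the hard path
    if n_hard <= 0:
        return ordered[-1]         # no score exceeds the maximum
    if n_hard >= n:
        return float("-inf")       # every score exceeds the threshold
    # Largest score that is still treated as easy.
    return ordered[n - n_hard - 1]


# Toy difficulty scores: 1000 distinct values in [0, 1), in random order.
scores = [i / 1000 for i in range(1000)]
random.Random(0).shuffle(scores)

threshold = threshold_for_hard_fraction(scores, 0.75)
hard_share = sum(s > threshold for s in scores) / len(scores)
print(f"threshold={threshold:.3f}, share routed to hard={hard_share:.2f}")
```

Under these assumptions, routing 75% of the images to the hard path means cutting at the lower 25% of the sorted scores. If the stored value is instead 1 - difficulty, the direction of the comparison flips and the same fraction maps to the other end of the distribution.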

Hello, I ran into the following error when training on my own data.
Running python train_step2.py fails with:
Traceback (most recent call last):
  File "train_step2.py", line 551, in <module>
    train(hyp, opt, device, tb_writer)
  File "train_step2.py", line 160, in train
    optimizer.load_state_dict(ckpt['optimizer'])
  File "/opt/conda/lib/python3.8/site-packages/torch/optim/optimizer.py", line 141, in load_state_dict
    raise ValueError("loaded state dict has a different number of "
ValueError: loaded state dict has a different number of parameter groups
The error occurs when loading the model produced by train_step1.py.
The command was: python train_step2.py --weight runs/train/exp5/weights/last.pt --name dy-yolov7-step2
Tracing it in the code, I found that pg0 is empty, so the new optimizer's state dict has only 2 parameter groups, while the optimizer state saved by train_step1.py has 3:
pg0, pg1, pg2 = [], [], []  # optimizer parameter groups
for k, v in model.named_modules():
    if 'router' in k:
        if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter):
            pg2.append(v.bias)  # biases
        if isinstance(v, nn.BatchNorm2d):
            pg0.append(v.weight)  # no decay
        elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter):
            pg1.append(v.weight)  # apply decay

if opt.adam:
    optimizer = optim.AdamW(pg1, lr=hyp['lr0'], weight_decay=hyp['weight_decay'], betas=(hyp['momentum'], 0.999))  # adjust beta1 to momentum
else:
    optimizer = optim.SGD(pg1, lr=hyp['lr0'], weight_decay=hyp['weight_decay'], momentum=hyp['momentum'], nesterov=True)
if len(pg0):
    optimizer.add_param_group({'params': pg0, 'weight_decay': 0})  # add pg0 without weight_decay
if len(pg2):
    optimizer.add_param_group({'params': pg2, 'weight_decay': 0})  # add pg2 (biases)
logger.info('Optimizer groups: %g .bias, %g conv.weight, %g other' % (len(pg2), len(pg1), len(pg0)))
del pg0, pg1, pg2

Thanks!
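
One way to reason about this error is to compare the parameter-group layout of the freshly built optimizer with the one stored in the checkpoint before trying to load it. The sketch below works on plain dictionaries shaped like the output of a PyTorch optimizer's `state_dict()` (a `param_groups` list whose entries each hold a `params` list). The helper names are hypothetical, and this is a possible guard, not the repository's fix.

```python
def param_group_sizes(state_dict):
    """Return the number of parameters in each optimizer parameter group."""
    return [len(group["params"]) for group in state_dict["param_groups"]]


def optimizer_state_is_compatible(current_state, saved_state):
    """True when both states have the same groups with the same sizes.

    PyTorch's load_state_dict rejects a saved state whose number of groups,
    or number of parameters per group, differs from the live optimizer.
    """
    return param_group_sizes(current_state) == param_group_sizes(saved_state)


# Toy states mimicking the situation in the report (contents are made up):
# three groups saved by step 1, two groups built in step 2.
step1_state = {
    "state": {},
    "param_groups": [{"params": [0, 1]}, {"params": [2]}, {"params": [3]}],
}
step2_state = {
    "state": {},
    "param_groups": [{"params": [0]}, {"params": [1]}],
}

print(optimizer_state_is_compatible(step2_state, step1_state))
```

In a training script, such a check could decide whether to call `optimizer.load_state_dict(ckpt['optimizer'])` or to start step 2 with a fresh optimizer state. Which of the two is appropriate here is a judgment call for the maintainers.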

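Question about the adaptive offset in the paper's optimization strategy

How is this balance point computed? In practice, how do we determine which images are of medium difficulty? In other words, how is the position of the yellow point in Figure 5 found? I look forward to your answer.

Class names are not shown in the predicted labels during test and detect

Hello, predictions look normal during training, but in test and detect the class in the predicted labels is shown as 0, 1 instead of the class name.
train:

test and detect: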

The implementation of Dy-YOLOv7 is developed by the YOLOv7 [45] framework, with two identical detectors. The implementation of dynamic two-stage detectors is developed by the open-source CBNet [24] framework, with two identical backbones and a shared neck and head.

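Hello, what does this sentence mean? Does Dy-YOLOv7 follow the YOLOv7 framework and contain two identical YOLOv7 detectors, while the dynamic two-stage detectors follow the CBNet architecture, with two identical backbones and a shared neck and head? If so, the dynamic two-stage detectors do not seem to match the DynamicDet architecture.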

Can it be extended to anchor-free detectors?

Dear Author:
Hello, first of all, thank you for your excellent work.
You say “This dynamic architecture can be easily adapted to mainstream detectors, e.g., Faster R-CNN and YOLO”, I was wondering if it could be extended to anchor-free detectors such as CornerNet and CenterNet.

Which model path should be provided for dynamic thresholds for variable-speed inference step?

In the dynamic thresholds for variable-speed inference step:

python get_dynamic_thres.py --device 0 --batch-size 1 --img-size 640 --cfg cfg/dy-yolov7-step2.yaml --weight weights/dy-yolov7.pt --data data/coco.yaml --task val

Which weights file is the command pointing to? Intuitively, could it be the best weights from step2 training?

Thanks!

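Are weight files provided?

Does the open-source project provide the weight file dy-yolov7.pt?

Why are the reward and penalty terms half of the training loss difference?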

Why do you use segmentation labels instead of bounding boxes in the COCO dataset?

Hello,
This is indeed a fascinating piece of work, and I appreciate the outstanding contribution you have made to the community. However, I have a question. I noticed that the labels in this repository use segmentation instead of bounding boxes. Could you explain the reason behind this choice? If my dataset is labeled with bounding boxes, would it still be compatible? Moreover, were the performances of other methods compared in the article also trained with segmentation labels? To my knowledge, both v7 and v5 have used bounding box labels.

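About the difficulty score computed by the adaptive router

Hello,

Why does the adaptive router output 1 - difficulty score rather than the difficulty score itself?

cuda10.2

Hello, does the environment have to be CUDA 11 or later?

Sorry, could you help me solve this problem?

What went wrong in this error message?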

Evaluation metrics explanation

Hi Team,

I was wondering how you are computing the metrics for evaluation. I was going through metrics.py file and came across ap_per_class function which seems to be computing the average precision for each class in an image. (FYI - my custom dataset only has 1 class with a lot of objects of that class in a single image)
I wanted to understand what *stats is (the parameter passed in the function) in test.py? And how does it help in being able to assign a predicted class to a ground truth?

Also,
I wanted to know how you are associating a particular predicted class with a ground truth? Is it solely based on the highest iou values? If yes, what if you assign a ground truth to a particular predicted class (and eliminate it from the iteration once it is assigned) and find a higher iou to another predicted class further down the iteration?

Thanks!
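
The concern raised above, that an early assignment could block a later pair with higher IoU, can be avoided by ranking all candidate pairs by IoU before assigning them. The sketch below is a generic single-class illustration with boxes in (x1, y1, x2, y2) format. It is not taken from this repository's test.py or metrics.py, and the function names are hypothetical.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_predictions(pred_boxes, gt_boxes, iou_thres=0.5):
    """Greedy one-to-one matching, highest IoU first.

    Returns a dict mapping prediction index -> ground-truth index.
    """
    pairs = []
    for pi, pred in enumerate(pred_boxes):
        for gi, gt in enumerate(gt_boxes):
            iou = box_iou(pred, gt)
            if iou >= iou_thres:
                pairs.append((iou, pi, gi))
    pairs.sort(reverse=True)  # best overlaps are assigned first

    matches, used_gt = {}, set()
    for iou, pi, gi in pairs:
        if pi not in matches and gi not in used_gt:
            matches[pi] = gi
            used_gt.add(gi)
    return matches


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 11, 11), (0, 0, 10, 10), (20, 20, 30, 31)]
print(match_predictions(preds, gts))
```

Here the second prediction overlaps the first ground-truth box perfectly, so it wins that box even though the first prediction appears earlier in the list; the first prediction is left unmatched and would count as a false positive.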

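Question about Fig. 7

Dear authors,
In Figure 7, why do the routers trained with the three different optimization strategies have the same inference time when run with batch size 1? Or are the network's inference time and the easy/hard split ratio held fixed, so that only the change in AP is observed?

About computing the model's FLOPs and parameters

Dear authors,
Two different FLOPs and parameter counts are reported here. Which one should be trusted?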

Config for custom data

Hello,
My dataset has only one object class. I have modified the coco.yaml file accordingly to set the data paths, nc = 1 and my class name (which is represented by 0 in the txt files).

Apart from that, do I need to alter any other cfg/hyp files?
I see cfg/dy-yolov7-step1.yaml has an nc variable which I think I would change. Any other changes that I’d need to make?

If I add attention to it, will it work better?

About training the adaptive router

Dear authors,
How is the adaptive router supervised during training?

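Is ONNX or TensorRT supported?

Is ONNX or TensorRT supported? Can you provide the code?

Hello, a question about Equation 13 in the paper

In the paper, is Δ = L1 - L2? If so, the right-hand side of Equation 13, L1 - L2 - Δ, would equal zero. How would gradient descent work then?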

Continue training model on new data

Hello!
I have a fully trained model (75 epochs) on a certain dataset and I tested it on another dataset. I now want to further train my model on this test dataset. Should I just run:

python train_step1.py --weights 'path/to/best_75_epochs.pt' (Or should the training be continued using last.pt?)

Can I run this on fewer epochs or should it be the same as best.pt? And will this new model be considered to be trained with 75 epochs + new training epochs or just new training epochs on new data?

Also, train_step2.py will then use this new model's last.pt weights, right?

Thanks!

Hello, I could not find the ResNet code in the code you shared. Could you provide that part?

Early stopping or using epoch_xx.pt

Hello,

Is there an early stopping mechanism for DynamicDet? Alternately, could I use intermediate weights in runs/train/weights like epoch_xx.pt for train_step2 and inference?

Thanks.

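May I ask what the prospects or practical value of this work are?

First, can I understand this work as a network with variable computation, one that uses fewer compute resources on easy images and more on hard images?
If so, on a resource-constrained device the compute would run out as soon as a hard image has to be processed. On a device with ample compute, I could simply use a more complex model, which would handle both hard and easy images quickly and at roughly the same speed.
So what is the practical value? Is it to shave a few milliseconds on devices with ample compute? Or is it, for example, that a server offering detection as a remote service could run only two instances of a complex model, but three instances of a model like this, that is, one more?
I look forward to your answer. Thanks!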
