
efficientdet-pytorch's Introduction

Hi, nice to meet you 👋

efficientdet-pytorch's People

Contributors

bubbliiiing


efficientdet-pytorch's Issues

train.py

Fail To Load Key: ['classifier.header.pointwise_conv.conv.weight', 'classifier.header.pointwise_conv.conv.bias'] ……
Fail To Load Key num: 2

Friendly reminder: it is normal for the head weights not to load; if the Backbone weights fail to load, something is wrong.
The expanded size of the tensor (46917) must match the existing size (49104) at non-singleton dimension 1. Target sizes: [1, 46917, 4]. Tensor sizes: [1, 49104, 4]
Error occurs, No graph saved
This error occurred at runtime.
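For reference, 49104 is the anchor count for a 512x512 input, assuming the usual five pyramid levels (strides 8 to 128) and 9 anchors per location; a quick check of that arithmetic:

# Anchor count for a 512x512 input, assuming 5 pyramid levels (strides 8..128)
# and 9 anchors per location; this is where the 49104 in the message comes from.
num_anchors = 9 * sum((512 // s) ** 2 for s in (8, 16, 32, 64, 128))
print(num_anchors)  # 49104

A different count (46917 here) usually means the anchors and the network outputs were built for different input shapes, i.e. the input_shape and phi settings do not match.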

Problem with get_dr_txt.py

RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([396, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([810, 64, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([396]) from checkpoint, the shape in current model is torch.Size([810]).
Why does the dimension mismatch above occur?
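For context, the head's pointwise conv has num_anchors * num_classes output channels, so the two shapes correspond to different class counts (assuming the usual 9 anchors per location); a quick check:

num_anchors = 9                      # 3 scales x 3 aspect ratios per location
print(396 // num_anchors)            # 44 -> number of classes the checkpoint was trained with
print(810 // num_anchors)            # 90 -> number of classes the current model is configured for
# Making the classes file used by get_dr_txt.py match the one used for training removes the mismatch.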

Bug encountered during prediction

File "predict.py", line 17, in
r_image = efficientdet.detect_image(image)
File "/home/404/efficientdet-pytorch-master/efficientdet.py", line 109, in detect_image
detection = torch.cat([regression,classification],axis=-1)
TypeError: cat() got an unexpected keyword argument 'axis'
(yp) [root@localhost efficientdet-pytorch-master]#
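Older PyTorch releases accept only the dim keyword for torch.cat; the NumPy-style axis alias was added later. A minimal sketch of the fix for the line in the traceback, with dummy tensors standing in for the model outputs:

import torch

regression = torch.randn(1, 49104, 4)        # placeholder for the box regression output
classification = torch.randn(1, 49104, 20)   # placeholder for the class scores
detection = torch.cat([regression, classification], dim=-1)   # use dim=-1 instead of axis=-1
print(detection.shape)                       # torch.Size([1, 49104, 24])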

Multi-GPU training

Does this code support multi-GPU training, and where is it configured?
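As a general PyTorch pattern (not necessarily how this repo wires it up), multi-GPU training is usually enabled by wrapping the network in nn.DataParallel after it is built; a minimal sketch with an illustrative model:

import torch
import torch.nn as nn

net = nn.Conv2d(3, 8, 3)              # stand-in for the EfficientDetBackbone instance
if torch.cuda.is_available():
    if torch.cuda.device_count() > 1:
        net = nn.DataParallel(net)    # splits each batch across all visible GPUs
    net = net.cuda()

Which GPUs are visible to the process can then be limited with the CUDA_VISIBLE_DEVICES environment variable.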

Question about how many times BiFPN is executed

I came here from your Bilibili course; the explanations are great! There is one thing about BiFPN I did not understand. The paper says the BiFPN block is applied multiple times, and your code also mentions that after the first BiFPN pass the outputs p3_out, p4_out, ..., p7_out become the inputs for the next pass. Could you point out where in the code the total number of BiFPN repetitions is defined?
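In the EfficientDet paper the number of BiFPN layers grows with the compound coefficient (3, 4, 5, 6, 7, 7, 8, 8 for d0 through d7), and implementations typically just stack that many BiFPN cells so each cell's p3_out ... p7_out feed the next. A minimal sketch, with a hypothetical cell class:

import torch.nn as nn

class BiFPNStack(nn.Module):
    # Illustrative only: `cell_cls` is a hypothetical BiFPN cell that takes and
    # returns the tuple (p3, p4, p5, p6, p7).
    def __init__(self, cell_cls, num_channels, repeats):
        super().__init__()
        self.cells = nn.Sequential(*[cell_cls(num_channels) for _ in range(repeats)])

    def forward(self, features):
        return self.cells(features)   # the output of one cell becomes the input of the next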

Problem when computing FPS with predict.py

Traceback (most recent call last):
File "e:/deeplearning/efficientdet-pytorch-master/efficientdet-pytorch-master/predict.py", line 120, in
tact_time = efficientdet.get_FPS(img, test_interval)
File "e:\deeplearning\efficientdet-pytorch-master\efficientdet-pytorch-master\efficientdet.py", line 237, in get_FPS
image_shape, self.letterbox_image, conf_thres = self.confidence, nms_thres = self.nms_iou)
TypeError: non_max_suppression() got multiple values for argument 'conf_thres'
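That TypeError generally means conf_thres is being filled twice: once by a positional argument and once by the keyword. A hypothetical illustration (the signature below is made up, not the repo's actual one):

def non_max_suppression(prediction, num_classes, conf_thres=0.5, nms_thres=0.4):
    pass

# If an extra positional argument (e.g. image_shape) is passed, it lands in the
# conf_thres slot, and the later keyword then collides with it:
# non_max_suppression(outputs, 20, image_shape, conf_thres=0.5)
#   -> TypeError: non_max_suppression() got multiple values for argument 'conf_thres'

So the call in efficientdet.py and the function definition it imports need to agree on the argument order.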

Problem running train.py

runfile('E:/A/efficientdet-pytorch-master/train.py', wdir='E:/A/efficientdet-pytorch-master')
Reloaded modules: nets, nets.efficientdet, utils, utils.anchors, nets.efficientnet, nets.layers, nets.efficientdet_training
Traceback (most recent call last):

File "D:\software\Anaconda\envs\torch1.2\lib\site-packages\torch\utils\tensorboard_init_.py", line 2, in
from tensorboard.summary.writer.record_writer import RecordWriter # noqa F401

ModuleNotFoundError: No module named 'tensorboard.summary'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

File "E:\A\efficientdet-pytorch-master\train.py", line 16, in
from utils.callbacks import LossHistory

File "E:\A\efficientdet-pytorch-master\utils\callbacks.py", line 9, in
from torch.utils.tensorboard import SummaryWriter

File "D:\software\Anaconda\envs\torch1.2\lib\site-packages\torch\utils\tensorboard_init_.py", line 4, in
raise ImportError('TensorBoard logging requires TensorBoard with Python summary writer installed. '

ImportError: TensorBoard logging requires TensorBoard with Python summary writer installed. This should be available in 1.14 or above.
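This usually just means the standalone tensorboard package is missing from (or too old in) the active environment; installing it into that environment (e.g. pip install tensorboard) and re-running is the common fix. A quick smoke test, independent of this repo:

from torch.utils.tensorboard import SummaryWriter   # fails if tensorboard is not installed

writer = SummaryWriter("runs/smoke_test")
writer.add_scalar("check", 1.0, 0)
writer.close()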

training warning

Epoch 3/25: 0%| | 0/204 [00:00<?, ?it/s<class 'dict'>]D:\Model\efficientDet\04\efficientdet-pytorch\Utils\dataloader.py:130: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
bboxes = np.array(bboxes)
D:\Model\efficientDet\04\efficientdet-pytorch\Utils\dataloader.py:130: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
bboxes = np.array(bboxes)

self._root = parser._parse_whole(source)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xac in position 129: illegal multibyte sequence

If you encounter the error above, open the annotation files with an explicit encoding, as shown below.

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id), encoding='UTF_8')

Trouble (original code):

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))

Fix:

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id), encoding='UTF_8')

TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect

/project/nets/layers.py:323: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_h = (math.ceil(w / self.stride[1]) - 1) * self.stride[1] - w + self.kernel_size[1]
/project/nets/layers.py:324: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_v = (math.ceil(h / self.stride[0]) - 1) * self.stride[0] - h + self.kernel_size[0]
/project/nets/layers.py:357: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_h = (math.ceil(w / self.stride[1]) - 1) * self.stride[1] - w + self.kernel_size[1]
/project/nets/layers.py:358: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_v = (math.ceil(h / self.stride[0]) - 1) * self.stride[0] - h + self.kernel_size[0]
/project/utils/anchors.py:25: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if image_shape[1] % stride != 0:
/project/utils/anchors.py:30: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
x = np.arange(stride / 2, image_shape[1], stride)
/project/utils/anchors.py:30: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
x = np.arange(stride / 2, image_shape[1], stride)
/project/utils/anchors.py:31: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
y = np.arange(stride / 2, image_shape[0], stride)
/project/utils/anchors.py:31: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
y = np.arange(stride / 2, image_shape[0], stride)
/project/utils/anchors.py:49: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
anchor_boxes = torch.from_numpy(anchor_boxes).to(image.device)

Problem during prediction

Traceback (most recent call last):
File "E:\yk\Code\efficientdet-pytorch\predict.py", line 77, in
r_image = efficientdet.detect_image(image, crop = crop, count=count)
File "E:\yk\Code\efficientdet-pytorch\efficientdet.py", line 216, in detect_image
draw.rectangle([left + i, top + i, right - i, bottom - i], outline=self.colors[c])
File "D:\Anaconda\envs\objectbox\lib\site-packages\PIL\ImageDraw.py", line 296, in rectangle
self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0

Hi, this error appeared when I ran prediction with the d3 pretrained weights; I am not sure what causes it.
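ImageDraw.rectangle requires x1 >= x0 and y1 >= y0, so the error means a degenerate or inverted box reached the drawing code (for example, when the thickness offset i pushes right - i below left + i). A hypothetical guard that reorders the coordinates before drawing:

from PIL import Image, ImageDraw

image = Image.new("RGB", (640, 480))
draw = ImageDraw.Draw(image)
left, top, right, bottom = 100, 120, 98, 150   # inverted box, for illustration
left, right = min(left, right), max(left, right)
top, bottom = min(top, bottom), max(top, bottom)
draw.rectangle([left, top, right, bottom], outline=(255, 0, 0))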

Problem when predicting after training

Hi, after training for 50 epochs I reloaded the model for testing and got the following error. Is it because I did not change the fc output? Thank you.
Traceback (most recent call last):
File "/home/yueyu/efficientdet-pytorch/predict.py", line 7, in
efficientdet = EfficientDet()
File "/home/yueyu/efficientdet-pytorch/efficientdet.py", line 53, in init
self.generate()
File "/home/yueyu/efficientdet-pytorch/efficientdet.py", line 74, in generate
self.net.load_state_dict(state_dict)
File "/home/yueyu/miniconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([180, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([36, 64, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([180]) from checkpoint, the shape in current model is torch.Size([36]).

Problem downloading the pretrained weights: invalid hash value

Hello, when downloading the pretrained weights I get the error below. Why does this happen?
RuntimeError: invalid hash value (expected "b0", got "73f3a3d3c70508a1dfc1fcb58f8ba0edb1a5aaf2f0aaa2ce4dcd34b18b1a97df")
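load_state_dict_from_url derives the expected hash prefix from the weight file's name (here it parsed "b0") and compares it against the real SHA256 of the downloaded file, so the check can fail either because the download was interrupted/corrupted or simply because the file name's hash segment is not the file's actual hash. One way around it is to find the cached download, or fetch the .pth manually, and load it locally; the path below is a placeholder:

import torch

print(torch.hub.get_dir())   # downloads are cached under <hub_dir>/checkpoints/
# Loading a manually downloaded checkpoint directly avoids re-running the hash check.
state_dict = torch.load("model_data/efficientnet-b0.pth", map_location="cpu")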

How to convert trained .pth weights and the model into a portable .onnx or .pb file

Many thanks for your videos and posts; I have been learning from them for a while, but I cannot convert my trained weights into an ONNX or pb model. Could you make a video or post specifically about converting trained weights and models in their various formats (.pt, .pth, .h5) into portable .onnx or .pb files, for deployment and inference on other platforms? That would help bring deep-learning vision into industrial use. Thank you very much.
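A general-purpose starting point is torch.onnx.export (the repo may or may not ship its own export script). The sketch below uses a tiny stand-in network, because the real EfficientDetBackbone and its post-processing are not shown here, and the input size and names are illustrative:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 4, 1))
model.eval()                                   # replace with the trained network in eval mode
dummy_input = torch.randn(1, 3, 512, 512)      # d0-style resolution, as an example
torch.onnx.export(model, dummy_input, "model.onnx",
                  opset_version=11,
                  input_names=["images"], output_names=["outputs"])

The exported .onnx file can then be run with ONNX Runtime or converted further for other deployment targets.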

Training on my own data gives very low mAP

Hello, and thank you for your work.
I trained efficientdet-d3 on another dataset and the results are quite poor; could you help me analyze why? Thanks!
The dataset is an aluminium-profile surface-defect detection dataset.
The backbone was frozen for the first 30 epochs and then unfrozen, and training continued until val_loss stopped decreasing.
The mAP results are as follows:
Get map.

2.78% = 不导电 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 喷流 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 擦花 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

38.01% = 杂色 AP || score_threhold=0.5 : F1=0.11 ; Recall=5.56% ; Precision=100.00%

13.00% = 桔皮 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 漆泡 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

0.00% = 漏底 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

5.01% = 脏点 AP || score_threhold=0.5 : F1=0.14 ; Recall=10.84% ; Precision=21.43%

0.00% = 角位漏底 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

0.00% = 起坑 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

mAP = 5.88%

Get map done.

The val_loss of each epoch is as follows:
692.5821533203125 18.020043546503242 6.0580932082551895 2.9250449375672773 1.7247823729659573 1.1773547078623916 0.8997215122887583 0.7432047751816836 0.6556733747323354 0.5967527682130987 0.5552684529261156 0.5247120744351185 0.5120401086680817 0.490136006564805 0.48692420892643207 0.4789598101016247 0.46626631509174 0.47130427938519104 0.46737760531179834 0.4521835113339352 0.4599695607568278 0.45360165202256403 0.44911260993191693 0.4493762557253693 0.4430947389566537 0.44133371464682347 0.44138025108611945 0.44033303834272153 0.4294479764772184 0.4288688725368543 0.43481780243898505 0.4259181917825742 0.4300796524691048 0.41619802549926205 0.41009346857222156 0.4145459183561268 0.4220570447618392 0.4012312131530758 0.3930108847735978 0.4246373758998825 0.38846801557758853 0.39046983073340424 0.5298791383462611 0.3835990623251271 0.424211601240199 0.40973395342702296 0.374117885035143 0.3753899414537113 0.3762250279924318 0.3619030268668239 0.4167037764147146 0.37196493557473614 0.3779459266798265 0.3777198547691996 0.3685086570235331 0.3662413968635139 0.3724715964062445 0.37364781386594276 0.3646606824068881 0.40380949271259026 0.36690836080085876 0.39359222296903384 0.35407807564001476 0.35932715515147395 0.35652801939355794 0.3581762860126015 0.3570717303777364 0.3549506452712995 0.3696301862208256 0.36184008566857273 0.35130936827566195 0.35368453527786836 0.36518700056667647 0.34808981838399794 0.3540204591326304 0.35394072493732864 0.3597586819651856 0.35117012387447394 0.35453211001829427 0.338129321863847 0.354188849065286 0.3480956403733189 0.3560672045421244 0.3494431595665528 0.3579848694489963 0.3562058041344828 0.3504434914620065 0.36181518032368437 0.3520742502730729 0.3408385811568196 0.3392267982239154 0.34833704064419463 0.3375129512330489 0.34490444361051514 0.3474811746168937 0.3642430662997623 0.3400071593029286 0.3533157893359216 0.34791290949084863 0.35537530932186256 0.3504680647000448 0.3470999870949717 0.3505480893011858 0.351605375561474 0.35297540341740224 0.3379963515743391 0.34117161613235725 0.3530546065364311 0.35188829584686615 0.35485441400322004 0.3438295838492575 0.3458844947058763 0.35485429858872247 0.3565744514674393 0.34367825498165033 0.34764408359109467 0.35074018681449676 0.3437748175225596 0.34253188953804437 0.3441715170976831 0.3461703913870142 0.34832563582084963 0.3480878883222146 0.34780260700899274 0.3481335105068648 0.3435267209172694 0.34988239888491024 0.35219536335277024 0.3490558216876503 0.34662854827162043 0.3428226285961582 0.3569837624985558 0.34416547353699134 0.34747738218796786 0.34722422989113116 0.34134776695673147 0.343578678819893 0.3511959823654659 0.3519623815012512 0.34963406944897635 0.3476591583118955 0.34318768255301374 0.3484093218819419 0.35494244101443395 0.3509057753010472 0.3456782892957997 0.3371015375118647 0.3482280739708178 0.3487955643447922 0.3454236375140165 0.35292010598663076 0.3519064793780224 0.3401252330461545 0.3439494139278558 0.34353317512171483 0.34736122029708394 0.3405051428491055 0.34928369027242734 0.34589640301332547 0.34704446495135327 0.348259352842596 0.34758604331803855 0.34305048622746964 0.3531607194428346 0.3395370377311066 0.3502034348950012 0.3446665857988062 0.3422466347605657 0.34934172028703475 0.35288102302088664 0.35932373247151056 0.3504922179255023 0.3531327823093578 0.3520925745868416 0.3523084268029501 0.34706903154503055 0.35040349913622015 0.3543376238432838 0.3538556429766007 0.34093494554842585 0.3473847135901451 0.3451451262764966 0.34536995963930195 
0.3503570426533471 0.34914408012557385 0.3532811352399303 0.34506166659629167 0.3482327019489968 0.3509918149393886 0.3524799474806928 0.35067986360570386 0.3517061372134668 0.3485593108543709 0.3451726289827432 0.34738497355424647 0.34392267045801256 0.3420782624674377 0.33875321841506817 0.347360303236255 0.3521433267186382 0.3485356454473378 0.34775876684753754 0.3512924103094126 0.3482155179632689 0.3382312229264583 0.35628125399573524 0.34736081468525215 0.3492826893369653 0.3421760888964827 0.3456490655610366 0.34405294005105747 0.34908889931862924 0.34774825335549775 0.35118296286508216 0.3519998987886443 0.3402652802474018 0.34220543765087624 0.34587571233399766

Question about filtering ground-truth boxes

Hi, does the code filter ground-truth boxes by aspect ratio when loading the labels? I find that the trained model performs poorly on long, thin targets; could it be that boxes with such extreme aspect ratios get filtered out during training?

voc map

What mAP does EfficientDet D0 reach when trained on the VOC dataset? The best I got during training was in the thirties, which does not seem good.


Training problem

Hello. During training an error suddenly appears; the output is shown below:
E:\py_file\efficientdet-pytorch-master\venv\Scripts\python.exe E:/py_file/efficientdet-pytorch-master/train_1.py
Loading weights into state dict...
Finished!
Start Train
Epoch 1/50: 26%|██▌ | 428/1675 [09:09<26:39, 1.28s/it, Conf Loss=994, Regression Loss=0.0386, lr=0.001]
Traceback (most recent call last):
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 214, in
val_loss = fit_one_epoch(net, efficient_loss, epoch, epoch_size, epoch_size_val, gen, gen_val, Freeze_Epoch, Cuda)
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 44, in fit_one_epoch
targets = [torch.from_numpy(ann).type(torch.FloatTensor).cuda() for ann in targets]
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 44, in
targets = [torch.from_numpy(ann).type(torch.FloatTensor).cuda() for ann in targets]
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.

I searched around but could not find a solution; I hope you can help me. Thank you.
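This TypeError typically shows up when the per-image annotation arrays are ragged or empty, so NumPy silently produces an object-dtype array that torch.from_numpy cannot handle. A hypothetical guard for the conversion in fit_one_epoch:

import numpy as np
import torch

targets = [np.zeros((0, 5)), np.array([[10, 20, 50, 60, 1]])]   # e.g. one image has no boxes
# Forcing float32 ensures empty annotations become a (0, 5) float array rather than
# dtype=object, which torch.from_numpy refuses to convert.
targets = [torch.from_numpy(np.asarray(ann, dtype=np.float32)) for ann in targets]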

Training logs, PR output, and specifying the training GPU

I run this program on a server, but I did not find where saving a training log is configured; I only see the weight files being saved.
When computing mAP, can recall and precision be reported at a chosen score threshold (e.g. 0.5)?
Finally, I did not find anywhere to specify which GPU to use; once the program runs, every GPU is used, which interferes with other people's work. Please advise.
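Restricting which GPUs a run can see is usually done outside the code via CUDA_VISIBLE_DEVICES (e.g. CUDA_VISIBLE_DEVICES=0 python train.py), or at the very top of train.py before CUDA is touched; a minimal sketch:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # expose only GPU 0 to this process

import torch
print(torch.cuda.device_count())           # now reports only the visible GPU(s)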

RuntimeError: CUDA error: device-side assert triggered

This error occurred while training on my own dataset; how should I solve it?
../aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [0,0,0], thread: [1,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
(the same assertion is repeated for threads [2,0,0] through [29,0,0])
Traceback (most recent call last):
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
Traceback (most recent call last):
File "/mnt/disk1/data0/jxt/efficientdet/train.py", line 504, in
fit_one_epoch(model_train, model, focal_loss, loss_history, eval_callback, optimizer, epoch,
File "/mnt/disk1/data0/jxt/efficientdet/utils/utils_fit.py", line 37, in fit_one_epoch
loss_value, _, _ = focal_loss(classification, regression, anchors, targets, cuda = cuda)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/disk1/data0/jxt/efficientdet/nets/efficientdet_training.py", line 210, in forward
if positive_indices.sum() > 0:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
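A device-side index-out-of-bounds assert raised inside the loss almost always traces back to a class id in the annotations that falls outside [0, num_classes - 1]. A hypothetical sanity check, assuming the usual "image_path x1,y1,x2,y2,class_id ..." line format of the generated 2007_train.txt:

num_classes = 20   # must equal the number of entries in the classes file used for training

with open("2007_train.txt", encoding="utf-8") as f:
    for line in f:
        for box in line.split()[1:]:                  # boxes follow the image path
            class_id = int(box.split(",")[4])
            assert 0 <= class_id < num_classes, f"bad class id {class_id} in: {line}"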

Comparison with other detection models

I compared the b0 network with SSD and it came out about 5 points lower than SSD. Is that normal? How have the results been for others running this model?

Data conversion

All of the XML files under Annotations were converted into the training data, and the remaining txt files are all empty.

Bug when computing mAP

Hello, and many thanks for sharing. When I run get_dr_txt.py, the generated txt files under detection are all empty. Tracing the problem: during the test pass, in EfficientNet's forward (in ./nets/efficientdet.py), at line 413, x = self.model._bn0(x), i.e. self._bn0 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps), different input values from the previous step all map to the same output after bn0. Whatever the input is, the output is just the value of 'backbone_net.model._bn0.bias', even though the weight is not zero. The shapes from the previous step are:
At test time:
[16, 3, 1024, 1024] input
[16, 48, 512, 512] after x = self.model._conv_stem(x)
At training time:
[2, 3, 1024, 1024] input
[2, 48, 512, 512] after x = self.model._conv_stem(x)

The model used at test time is from the 40th training epoch (phi=4); both the training and validation losses are 0.0002.

I tried adjusting the input shape at test time, cropping the inputs so that the shapes fed into the mAP computation are exactly the same as during training, but the error above remains.

When I fed in data captured during training, the results were completely fine.

Comparing the training inputs with the mAP inputs shows no obvious differences.

Looking forward to your reply, thank you.

mAP too low

I trained d1 to detect fabric defects (image size 1280x1080, box sizes around 50x900, some smaller), and the resulting mAP is only about 0.04.

Loss question

During training, the first few validation loss values reach several hundred thousand. After 100 epochs the loss is only around 1.x, but the mAP is 0. The same dataset format works fine with your yolov3 repo, yet here the mAP stays at 0. What else could be the cause? The model is d1.

get_dr_txt.py error

RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([45, 112, 1, 1]) from checkpoint, the shape in current model is torch.Size([189, 112, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([45]) from checkpoint, the shape in current model is torch.Size([189]).
I use phi=2 and changed the pth file accordingly; why do the dimensions still not match?

Hello, I used your EfficientDet code on two different datasets, but the results differ greatly.

Both datasets have images of roughly 1000 pixels.
On the dataset that performs poorly there are 10 object classes. My strategy was to use efficientdet-d4 as the initial weights; the resulting mAP only reaches about 75, while frcnn, ssd, retinanet, etc. all reach about 85-89. When I switched to efficientdet-d3 as the initial weights, with batch size 4, after 100 epochs every per-class AP is only in the single digits, which is very strange.
On the other dataset the results are very good: the mAP is close to, or even higher than, frcnn, ssd, retinanet, etc.
I hope you can help. Thank you very much!

Running get_map.py

Why does D0 give 95% precision but only 29% recall and a mAP of only 34.70%?

mAP on the VOC2007 test set is only about 30 (D0)

How many epochs did you train for when using VOC2007 trainval? I trained for 65 epochs (starting from the D0 weights) and the results are poor. Is my training time too short, or is something else wrong?

Why is the mAP result always 0

Hi, I am training EfficientDet on Text-COCO for text detection. The predicted bounding boxes are in the right places, and I then run get_map.py as described to compute mAP, precision and recall.
But the results are always 0, and the vast majority of detections are counted as false positives. Is this a data problem or a problem with the get_map code?

Fusion weights not updating in BiFPN's weighted feature fusion (the simple attention mechanism)

Hello, I tried porting the weighted fusion from the BiFPN in your code into another detection framework to see whether it helps. The modified code runs, but when inspecting the trained model's state dict, the fusion weight w that was initialized to [1, 1] is still [1, 1] after training. Could you help me check where my code goes wrong, such that this parameter is never updated?
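For the fusion weights to receive gradients they have to be registered as nn.Parameter on the module (so they appear in model.parameters() and get passed to the optimizer), and they must actually take part in the forward computation. A minimal sketch of the fast normalized fusion, with hypothetical names:

import torch
import torch.nn as nn

class WeightedAdd(nn.Module):
    # Fast normalized fusion of two feature maps, in the spirit of BiFPN.
    def __init__(self, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(2))   # a plain tensor or buffer would never update
        self.eps = eps

    def forward(self, a, b):
        w = torch.relu(self.w)
        w = w / (w.sum() + self.eps)
        return w[0] * a + w[1] * b

If the weights stay at their initial value, common causes are that they were created as plain tensors, that the module holding them was never added to the optimizer's parameter groups, or that requires_grad was turned off (for example by a backbone-freezing step).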

Input image size

For different models, e.g. d0 and d7, why can the training input size not simply be set to 640x640 for both? From the code, the input size only grows with the model; why is that?
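The input resolution is scaled jointly with the backbone, BiFPN and head as part of EfficientDet's compound scaling, which is why each compound coefficient has its own fixed size; for reference, the paper's resolutions for d0 through d7 are:

# Input resolutions per compound coefficient phi (d0..d7), as given in the EfficientDet paper.
input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536]
phi = 0
print(input_sizes[phi])   # 512 for d0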

Question about changing the network

I replaced the backbone with the yolov5 backbone. After training, the loss is very low but the test mAP is very poor. Could you offer some pointers on what might be wrong?

Hi, some problems when generating the training files 2007_train.txt and 2007_val.txt

Hi, when using voc_annotation.py to generate the 2007_train.txt and 2007_val.txt used for training, if the dataset contains a negative image set (images with no target objects), it raises xml.etree.ElementTree.ParseError: no element found: line 1, column 0, presumably because the image has no objects to train on. Should these images be skipped during generation?
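That ParseError is what ElementTree raises for an empty (zero-byte) XML file, which is how negative samples are often shipped. A hypothetical guard for voc_annotation.py that detects annotations with no objects so they can be skipped (or written without boxes, if background-only images are wanted):

import os
import xml.etree.ElementTree as ET

def annotation_objects(xml_path):
    # Returns the list of <object> elements, or [] for empty/unparsable annotation files.
    if not os.path.exists(xml_path) or os.path.getsize(xml_path) == 0:
        return []
    try:
        root = ET.parse(xml_path).getroot()
    except ET.ParseError:
        return []
    return root.findall("object")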
