
yolov3-network-slimming's Introduction

yolov3-network-slimming


An application of Learning Efficient Convolutional Networks Through Network Slimming (ICCV 2017) to yolov3 and yolov2.

Environment

PyTorch 0.4.1

Windows 10

How to use

1. Run sparsity training on the original weights file

python sparsity_train.py -sr --s 0.0001 --image_folder coco.data --cfg yolov3.cfg --weights yolov3.weights

2. Prune

python prune.py --cfg yolov3.cfg --weights checkpoints/yolov3_sparsity_100.weights --percent 0.3

3. Fine-tune the pruned weights

python sparsity_train.py --image_folder coco.data --cfg prune_yolov3.cfg --weights prune_yolov3.weights

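About new_prune.py

new_prune.py uses an updated algorithm that guarantees no layer is pruned down to 0 channels. Following Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers (ICLR 2018), the β coefficients of the BN layers are retained after pruning.

To do

COCO evaluation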


yolov3-network-slimming's Issues

always RuntimeError: CUDA error: out of memory

one GTX1070 8G
64G RAM

width= 608
height= 608
random=1
All settings are default; I did not change anything.

I have tried the following combinations, and all of them failed with:
RuntimeError: CUDA error: out of memory
batch=30
subdivisions=15

batch=15
subdivisions=5

batch=10
subdivisions=5

Your pruned weights are not working.

I have successfully run your program. However, object detection cannot be done with pruned weight values.
Is there any work you achieved with these weight values ?

Best Regards

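About step 1: sparsity training

Hello, is the first step, sparsity training, a direct fine-tune starting from yolov3.weights? After sparsity training this way, will more of the BN-layer γ values end up close to 0? I ask because the original paper trains from scratch. Thanks.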

Sparsity training on my own network

Hello.
I would like to know whether a network that I modified from yolov3 can be sparsity-trained with this project.

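Loss rises after trying multi-GPU training

When I use DataParallel for multi-GPU training, only the total loss, recall and precision seem to fluctuate normally; the other losses (x, y, w, h, conf, cls) keep rising as if they were being accumulated.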

[Epoch 0/2000, Batch 1/5864] [Losses: x 0.591932, y 0.528098, w 0.746021, h 0.659080, conf 1.570114, cls 4.276174, total 2.265707, recall: 0.69901, precision: 0.04759]
[Epoch 0/2000, Batch 2/5864] [Losses: x 0.857584, y 0.785161, w 1.098811, h 0.966139, conf 2.190402, cls 6.412073, total 1.969376, recall: 0.73333, precision: 0.04094]
[Epoch 0/2000, Batch 3/5864] [Losses: x 1.102311, y 1.012907, w 1.538904, h 1.343006, conf 2.772152, cls 8.552397, total 2.005754, recall: 0.72408, precision: 0.04118]
[Epoch 0/2000, Batch 4/5864] [Losses: x 1.368159, y 1.248976, w 1.888377, h 1.623722, conf 3.388307, cls 10.692499, total 1.944181, recall: 0.78692, precision: 0.05367]
[Epoch 0/2000, Batch 5/5864] [Losses: x 1.633099, y 1.499638, w 2.118895, h 1.983082, conf 4.027135, cls 12.831954, total 1.941882, recall: 0.69855, precision: 0.03878]
[Epoch 0/2000, Batch 6/5864] [Losses: x 1.869458, y 1.737572, w 2.342767, h 2.216016, conf 4.740722, cls 14.973027, total 1.892880, recall: 0.72807, precision: 0.04427]
[Epoch 0/2000, Batch 7/5864] [Losses: x 2.126234, y 1.999815, w 2.588304, h 2.463250, conf 5.330129, cls 17.117223, total 1.872695, recall: 0.69712, precision: 0.04138]
[Epoch 0/2000, Batch 8/5864] [Losses: x 2.410720, y 2.238643, w 2.903877, h 2.808697, conf 6.035615, cls 19.263997, total 2.018299, recall: 0.63497, precision: 0.03782]
[Epoch 0/2000, Batch 9/5864] [Losses: x 2.672090, y 2.490939, w 3.304709, h 3.146332, conf 6.816026, cls 21.392921, total 2.080733, recall: 0.77885, precision: 0.05805]
[Epoch 0/2000, Batch 10/5864] [Losses: x 2.907513, y 2.714998, w 3.519198, h 3.365142, conf 7.449289, cls 23.531282, total 1.832202, recall: 0.76154, precision: 0.04329]
[Epoch 0/2000, Batch 11/5864] [Losses: x 3.195225, y 2.953113, w 3.833658, h 3.722046, conf 8.108175, cls 25.690390, total 2.007593, recall: 0.65499, precision: 0.04191]
[Epoch 0/2000, Batch 12/5864] [Losses: x 3.477309, y 3.211058, w 4.125373, h 3.966005, conf 8.913965, cls 27.824598, total 2.007851, recall: 0.76720, precision: 0.04063]
[Epoch 0/2000, Batch 13/5864] [Losses: x 3.751552, y 3.462639, w 4.544926, h 4.246683, conf 9.675089, cls 29.995962, total 2.079271, recall: 0.66534, precision: 0.05127]
[Epoch 0/2000, Batch 14/5864] [Losses: x 3.988789, y 3.692157, w 4.929888, h 4.622669, conf 10.249074, cls 32.140991, total 1.973358, recall: 0.73635, precision: 0.05072]
[Epoch 0/2000, Batch 15/5864] [Losses: x 4.210305, y 3.945491, w 5.236189, h 4.965648, conf 10.837334, cls 34.282791, total 1.927095, recall: 0.71244, precision: 0.04896]

I added code to detect multiple GPUs:

if cuda:
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
        optimizer = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
    model = model.cuda()
    optimizer = optimizer.cuda()

Then I changed the loss and optimizer update code to:

optimizer.module.zero_grad()
loss = model(imgs, targets)
#loss.sum().backward()
loss.mean().backward()
optimizer.module.step()

I also changed every reference to model into model.module. Could someone please take a look?

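Model format question

After sparsity training on the original weights file, after pruning, and after fine-tuning the pruned weights, is the model still in cfg/weights format, or is it saved in PyTorch format?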

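Question about sparsity training

Hello.
After a few training steps the loss becomes very large. Does this matter?
Should I keep training, or change some parameter?
Thanks.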

Pruning problem

Hello:

load network
done!
load weightsfile
done!

Pre-processing...
layer index: 4 total channel: 32 remaining channel: 32
layer index: 8 total channel: 64 remaining channel: 64
layer index: 12 total channel: 32 remaining channel: 31
layer index: 16 total channel: 64 remaining channel: 64
layer index: 22 total channel: 128 remaining channel: 128
layer index: 26 total channel: 64 remaining channel: 53
layer index: 30 total channel: 128 remaining channel: 128
layer index: 36 total channel: 64 remaining channel: 63
layer index: 40 total channel: 128 remaining channel: 128
layer index: 46 total channel: 256 remaining channel: 256
layer index: 50 total channel: 128 remaining channel: 86
layer index: 54 total channel: 256 remaining channel: 256
layer index: 60 total channel: 128 remaining channel: 125
layer index: 64 total channel: 256 remaining channel: 256
layer index: 70 total channel: 128 remaining channel: 126
layer index: 74 total channel: 256 remaining channel: 256
layer index: 80 total channel: 128 remaining channel: 128
layer index: 84 total channel: 256 remaining channel: 256
layer index: 90 total channel: 128 remaining channel: 128
layer index: 94 total channel: 256 remaining channel: 256
layer index: 100 total channel: 128 remaining channel: 126
layer index: 104 total channel: 256 remaining channel: 256
layer index: 110 total channel: 128 remaining channel: 120
layer index: 114 total channel: 256 remaining channel: 256
layer index: 120 total channel: 128 remaining channel: 125
layer index: 124 total channel: 256 remaining channel: 256
layer index: 130 total channel: 512 remaining channel: 512
layer index: 134 total channel: 256 remaining channel: 256
layer index: 138 total channel: 512 remaining channel: 512
layer index: 144 total channel: 256 remaining channel: 249
layer index: 148 total channel: 512 remaining channel: 512
layer index: 154 total channel: 256 remaining channel: 244
layer index: 158 total channel: 512 remaining channel: 512
layer index: 164 total channel: 256 remaining channel: 239
layer index: 168 total channel: 512 remaining channel: 512
layer index: 174 total channel: 256 remaining channel: 249
layer index: 178 total channel: 512 remaining channel: 512
layer index: 184 total channel: 256 remaining channel: 240
layer index: 188 total channel: 512 remaining channel: 512
layer index: 194 total channel: 256 remaining channel: 256
layer index: 198 total channel: 512 remaining channel: 512
layer index: 204 total channel: 256 remaining channel: 235
layer index: 208 total channel: 512 remaining channel: 512
layer index: 214 total channel: 1024 remaining channel: 1024
layer index: 218 total channel: 512 remaining channel: 464
layer index: 222 total channel: 1024 remaining channel: 1024
layer index: 228 total channel: 512 remaining channel: 466
layer index: 232 total channel: 1024 remaining channel: 1024
layer index: 238 total channel: 512 remaining channel: 473
layer index: 242 total channel: 1024 remaining channel: 1024
layer index: 248 total channel: 512 remaining channel: 465
layer index: 252 total channel: 1024 remaining channel: 1024
layer index: 258 total channel: 512 remaining channel: 512
layer index: 262 total channel: 1024 remaining channel: 1024
layer index: 266 total channel: 512 remaining channel: 512
layer index: 270 total channel: 1024 remaining channel: 1024
layer index: 274 total channel: 512 remaining channel: 512
layer index: 278 total channel: 1024 remaining channel: 1024
layer index: 291 total channel: 256 remaining channel: 1
layer index: 299 total channel: 256 remaining channel: 0
layer index: 303 total channel: 512 remaining channel: 0
layer index: 307 total channel: 256 remaining channel: 0
layer index: 311 total channel: 512 remaining channel: 0
layer index: 315 total channel: 256 remaining channel: 0
layer index: 319 total channel: 512 remaining channel: 22
layer index: 332 total channel: 128 remaining channel: 7
layer index: 340 total channel: 128 remaining channel: 1
layer index: 344 total channel: 256 remaining channel: 1
layer index: 348 total channel: 128 remaining channel: 3
layer index: 352 total channel: 256 remaining channel: 1
layer index: 356 total channel: 128 remaining channel: 1
layer index: 360 total channel: 256 remaining channel: 7
Pre-processing Successful!

save pruned cfg file in prune_yolov3_20.cfg
Traceback (most recent call last):
File "prune.py", line 91, in
newmodel = Darknet(prunecfg)
File "/home2/lc/yolov3-network-slimming/yolomodel.py", line 324, in init
self.net_info, self.module_list = create_modules(self.blocks)
File "/home2/lc/yolov3-network-slimming/yolomodel.py", line 232, in create_modules
conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias=bias)
File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 297, in init
False, _pair(0), groups, bias)
File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 38, in init
self.reset_parameters()
File "/home/user1/.local/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 44, in reset_parameters
stdv = 1. / math.sqrt(n)
ZeroDivisionError: float division by zero

Why do some layers end up with 0 channels after pruning?
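One plausible reading of the log above, assuming prune.py follows the Network Slimming recipe of a single global threshold over all BN scale factors: if every γ in a layer falls below that threshold, the layer keeps 0 channels and rebuilding its `nn.Conv2d` fails. The sketch below is illustrative only. `global_threshold` and `channels_to_keep` are hypothetical helpers, not functions from this repository, and keeping at least one channel per layer is just one possible safeguard.

```python
def global_threshold(gammas_per_layer, percent):
    """Return the |gamma| value below which `percent` of all channels fall."""
    all_gammas = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    index = int(len(all_gammas) * percent)
    return all_gammas[min(index, len(all_gammas) - 1)]


def channels_to_keep(gammas, threshold, min_channels=1):
    """Indices of channels whose |gamma| exceeds the threshold.

    If fewer than `min_channels` survive, fall back to the channels with
    the largest |gamma| so the layer is never pruned to zero.
    """
    keep = [i for i, g in enumerate(gammas) if abs(g) > threshold]
    if len(keep) < min_channels:
        ranked = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]), reverse=True)
        keep = sorted(ranked[:min_channels])
    return keep


# Toy BN scale factors (hypothetical values): the second layer is uniformly small.
layers = [[0.9, 0.8, 0.7, 0.6], [0.01, 0.02, 0.03, 0.015]]
thr = global_threshold(layers, percent=0.5)
naive = [[i for i, g in enumerate(layer) if abs(g) > thr] for layer in layers]
safe = [channels_to_keep(layer, thr) for layer in layers]
print(naive)
print(safe)
```

With these toy values the naive selection leaves the second layer empty, while the guarded version keeps its single largest channel.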

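Label format when extracting COCO data

For each object in an image, what do the fields "label x y w h" in the corresponding label file mean?

  1. A label value in 1-80, box center x, box center y, box width, box height
  2. A label value in 1-80, box x / image width, box y / image height, box width / image width, box height / image height.

Which one is it?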
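For reference, the darknet convention that most YOLO label files follow is close to the second form: `class x_center y_center width height`, with a 0-based class id and every coordinate divided by the image width or height. Whether this repository's data loader expects exactly that is an assumption here. The hypothetical helper below converts a COCO-style pixel box `[x_min, y_min, width, height]` into such a line.

```python
def coco_box_to_yolo_line(class_id, box, image_width, image_height):
    """Convert a COCO pixel box [x_min, y_min, w, h] to a darknet label line."""
    x_min, y_min, w, h = box
    x_center = (x_min + w / 2) / image_width
    y_center = (y_min + h / 2) / image_height
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        class_id, x_center, y_center, w / image_width, h / image_height
    )


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
line = coco_box_to_yolo_line(0, [100, 50, 200, 100], image_width=400, image_height=200)
print(line)
```

COCO category ids are not contiguous, so they are usually remapped to 0-79 before being written as `class_id`.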

division by zero (during sparsity training)

The following error occurred during sparsity training:
zweistein@zweistein-System-Product-Name:~/桌面/yolov3-network-slimming$ python sparsity_train.py -sr --s 0.1 --image_folder cfg/coco.data --cfg cfg/yolov3.cfg --weights yolov3.weights
load network
done!
load weightsfile
done!
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
/usr/local/lib/python3.5/dist-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
Traceback (most recent call last):
File "sparsity_train.py", line 154, in
train()
File "sparsity_train.py", line 100, in train
loss = model(imgs, targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/zweistein/桌面/yolov3-network-slimming/yolomodel.py", line 347, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/zweistein/桌面/yolov3-network-slimming/yolomodel.py", line 107, in forward
precision = float(nCorrect / nProposals)
ZeroDivisionError: division by zero
How should I fix this?
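The traceback points at `precision = float(nCorrect / nProposals)`, which fails when a batch yields no proposals. A minimal guard, sketched here outside the repository code (the function name `safe_precision` is hypothetical), returns 0 in that case instead of raising:

```python
def safe_precision(n_correct, n_proposals):
    """Precision as a float, defined as 0.0 when there are no proposals."""
    if n_proposals == 0:
        return 0.0
    return float(n_correct) / float(n_proposals)


print(safe_precision(3, 12))
print(safe_precision(0, 0))
```

This only avoids the crash; if proposals stay at zero for many batches, the training setup itself (for example the `--s` value) probably deserves a second look.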

The pruned model does not predict anything

I used new_prune.py directly to prune the official darknet yolov3.weights model.

Pruning command

python new_prune.py --cfg cfg/yolov3.cfg --weights yolov3.weights

Detection command

darknet detector test data/coco.data cfg/prune_yolov3.cfg  prune_yolov3.weights

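The test image is dog.jpg.
The dog image is displayed, but no detection boxes appear and the console window prints no predictions. Detection works with the original cfg and weights files.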

Can this code run yolov3-tiny?

Hi @talevolano,

For yolov3.cfg, it works. Yet I can make yolov3-tiny work with this code by simply changing the command to point to yolov3.cfg as well as yolov3.weights. I would like to ask whether this implementation can handle yolov3-tiny.cfg with yolov3-tiny.weights. Thanks.

Question about dontprune

for k, m in enumerate(model.modules()):
    if isinstance(m, shortcutLayer):
        x = k + m.froms - 8
        donntprune.append(x)
        x = k - 3
        donntprune.append(x)
# print(donntprune)

Does anyone know where the -8 at the end of line 36 comes from?

Test demo

Could the author provide a test demo?

Syntax error

Hello,
I have a tiny yolov3 architecture that I want to optimize, I wrote the following in the terminal:

python sparsity_train.py -sr --s 0.0001 --image_folder obj.data --cfg cfg/yolov3-tiny_obj1.cfg --weights yolov3-tiny_obj1_15000.weights

I got the error:

Traceback (most recent call last):
File "sparsity_train.py", line 3, in
from yolomodel import *
File "/home/abanoub/Desktop/yolov3-network-slimming-master/yolomodel.py", line 347
x, *losses = self.module_list[i][0](x, targets)
^
SyntaxError: invalid syntax

Any idea why this is occurring? Does the code need a specific Python version?
Help is much appreciated! Cheers
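The flagged line uses starred assignment (`x, *losses = ...`), which is Python 3 syntax (PEP 3132); a Python 2 interpreter rejects it with exactly this `SyntaxError`. The other tracebacks on this page show Python 3.5 to 3.7, so running the script with `python3` is a reasonable first check. A small sketch of the construct, with made-up values:

```python
import sys

# Print the interpreter version; starred assignment needs Python 3.
print(sys.version_info[:2])

# The first value goes to `x`; the remaining values are collected into a list.
outputs = ("features", 0.5, 0.4, 0.3)
x, *losses = outputs
print(x, losses)
```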

w and h become very large during training

The following problem appeared during training:
[Epoch 0/2000, Batch 114/117264] [Losses: x 0.567772, y 0.612249, w 25922512289561715933184.000000, h 142289888154081864712192.000000, conf 45.791771, cls 12.687886, total 168212400443643580645376.000000, recall: 0.00000, precision: 0.00000]

It is accompanied by this error:
Traceback (most recent call last):
File "sparsity_train.py", line 154, in
train()
File "sparsity_train.py", line 100, in train
loss = model(imgs, targets)
File "/home/volcano/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/volcano/expand/yolov3-network-slimming/yolomodel.py", line 353, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/home/volcano/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/volcano/expand/yolov3-network-slimming/yolomodel.py", line 134, in forward
loss_conf = self.bce_loss(pred_conf[conf_mask_false], tconf[conf_mask_false]) +
File "/home/volcano/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/volcano/anaconda3/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 498, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "/home/volcano/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2051, in binary_cross_entropy
input, target, weight, reduction_enum)
RuntimeError: Assertion `x >= 0. && x <= 1.' failed. input value should be between 0~1, but got nan at /opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THNN/generic/BCECriterion.c:62

Does anyone know how to fix this? Thanks!

"RuntimeError: reduce failed to synchronize: device-side assert triggered" while training

Traceback (most recent call last):
File "sparsity_train.py", line 154, in
train()
File "sparsity_train.py", line 100, in train
loss = model(imgs, targets)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/chensy/QW/yolov3-network-slimming/yolomodel.py", line 352, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/chensy/QW/yolov3-network-slimming/yolomodel.py", line 133, in forward
loss_conf = self.bce_loss(pred_conf[conf_mask_false], tconf[conf_mask_false]) +
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py", line 512, in forward
return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2113, in binary_cross_entropy
input, target, weight, reduction_enum)
RuntimeError: reduce failed to synchronize: device-side assert triggered

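About the loss function during sparsity training

I noticed that in updateBN() the weight update is:
add_(torch.sign(m.weight.data)),
I don't quite understand this part. Why isn't it add_(torch.abs(m.weight.data))?
I feel I haven't fully grasped the paper; could someone please explain? Many thanks.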
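In the Network Slimming paper the sparsity term added to the loss is s·Σ|γ|. `updateBN()` appears to add the gradient of that term directly to the BN weight gradient, and the derivative of |γ| with respect to γ is sign(γ), not |γ|. The following sketch checks this numerically in plain Python; `l1_penalty`, `sign` and `numeric_grad` are illustrative names, and `s` stands for the `--s` value.

```python
def l1_penalty(gamma, s):
    """Sparsity term s * |gamma| for a single BN scale factor."""
    return s * abs(gamma)


def sign(value):
    """Return 1, 0 or -1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def numeric_grad(gamma, s, eps=1e-6):
    """Central finite difference of the penalty with respect to gamma."""
    return (l1_penalty(gamma + eps, s) - l1_penalty(gamma - eps, s)) / (2 * eps)


s = 0.0001
for gamma in (0.8, -0.3):
    # The numeric gradient and s * sign(gamma) should agree.
    print(gamma, numeric_grad(gamma, s), s * sign(gamma))
```

Adding s·|γ| to the gradient instead would push positive and negative factors in the same direction, so a negative γ would move away from 0 rather than towards it.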

This error occurs during sparsity training and I don't understand the cause. Any suggestions would be appreciated: RuntimeError: cannot perform reduction function max

load network
done!
load weightsfile
done!
Traceback (most recent call last):
File "sparsity_train.py", line 154, in
train()
File "sparsity_train.py", line 100, in train
loss = model(imgs, targets)
File "/home/caesar/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/caesar/deep/yolov3_limming/yolomodel.py", line 353, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/home/caesar/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/caesar/deep/yolov3_limming/yolomodel.py", line 136, in forward
loss_cls = (1 / nB) * self.ce_loss(pred_cls[mask], torch.argmax(tcls[mask], 1))
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity

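About training time

Has anyone measured roughly how long training takes?
What hardware platform did you use?
How large was the COCO dataset you used?
Roughly how much time did it take?

Thanks!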

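About the base network for sparsity training

Hello. I ran sparsity training on the full yolov3 network, pruned 80% after training, and then fine-tuned; there was some loss, but the results were acceptable. However, when I used a manually pruned network as the base network for sparsity training (it performed fine before sparsification), the network performed poorly after sparsification even without pruning, and after pruning and fine-tuning the precision stayed very low. Does the base network have to be the full yolov3?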

tensor dimension mismatch

Dear @talebolano,
I got the error below. Is this a problem on my side? If not, how can I fix it?

E:\TSpring(YOLO)4classes-new2>python sparsity_train.py -sr --s 0.0001 --image_folder 4classes-new2.data --cfg darknet53.cfg --weights darknet53_608_random_60000.weights --reso 608
load network
done!
load weightsfile
done!
C:\Python36\lib\site-packages\skimage\transform\_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
  warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
C:\Python36\lib\site-packages\skimage\transform\_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
  warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
Traceback (most recent call last):
  File "sparsity_train.py", line 154, in <module>
    train()
  File "sparsity_train.py", line 100, in train
    loss = model(imgs, targets)
  File "C:\Python36\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\TSpring(YOLO)4classes-new2\yolomodel.py", line 347, in forward
    x, *losses = self.module_list[i][0](x, targets)
  File "C:\Python36\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\TSpring(YOLO)4classes-new2\yolomodel.py", line 55, in forward
    prediction = x.view(nB, nA, self.bbox_attrs, nG, nG).permute(0, 1, 3, 4, 2).contiguous()
RuntimeError: invalid argument 2: size '[1 x 9 x 9 x 19 x 19]' is invalid for input with 9747 elements at ..\aten\src\TH\THStorage.cpp:84
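The numbers in the error can be checked by hand. The input has 9747 = 27 × 19 × 19 elements, so the last convolution outputs 27 channels, which is 3 anchors × (5 + 4 classes). The `view` call, however, asks for 9 anchors × 9 attributes = 81 channels. That suggests, as an assumption rather than a diagnosis, that the detection layer is counting all 9 anchors from the cfg instead of the 3 selected for that layer. The arithmetic as a sketch (`expected_channels` is an illustrative helper):

```python
def expected_channels(num_anchors, num_classes):
    """Channels a YOLO detection layer expects: anchors * (x, y, w, h, conf + classes)."""
    return num_anchors * (5 + num_classes)


grid = 19            # nG in the error message
num_classes = 4      # bbox_attrs = 9 = 5 + 4 classes
elements = 9747      # size of the tensor passed to view()

actual_channels = elements // (grid * grid)
print(actual_channels)                       # channels produced by the conv layer
print(expected_channels(3, num_classes))     # 3 anchors per detection layer
print(expected_channels(9, num_classes))     # 9 anchors, as requested by view()
```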

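About training

Is there a script for training the original model? I would like to see how the original model performs after training; compressing it afterwards would then be meaningful.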

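About using this on yolov2

I used your code for sparsity training on yolov2 (yolov2, 608 input, two object classes).
1) Error:
The loss becomes very large and then training crashes with exploding gradients. In addition, build_targets in utils.py reports that index=19 in dimension 3 is out of bounds.
2) In the source file yolomodel.py:
self.losses["recall"] /= 3
self.losses["precision"] /= 3
Why divide by 3? Is it because yolov3 has three output layers? If so, does this need to be changed for yolov2?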

yolov3-tiny training error

Hello, I get an error when training yolov3-tiny:
load network
done!
load weightsfile
done!
/home/andy0212/anaconda3/lib/python3.7/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
/home/andy0212/anaconda3/lib/python3.7/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
/home/andy0212/anaconda3/lib/python3.7/site-packages/torch/nn/_reduction.py:16: UserWarning: reduction='elementwise_mean' is deprecated, please use reduction='mean' instead.
warnings.warn("reduction='elementwise_mean' is deprecated, please use reduction='mean' instead.")
/home/andy0212/anaconda3/lib/python3.7/site-packages/torch/nn/modules/upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
Traceback (most recent call last):
File "sparsity_train.py", line 154, in
train()
File "sparsity_train.py", line 100, in train
loss = model(imgs, targets)
File "/home/andy0212/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/andy0212/Documents/yolov3-network-slimming/yolomodel.py", line 332, in forward
x = torch.cat((map1, map2), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 26 and 24 in dimension 2 at /opt/conda/conda-bld/pytorch_1549635019666/work/aten/src/THC/generic/THCTensorMath.cu:83
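A mismatch of 26 versus 24 is consistent with one known difference between darknet and a plain PyTorch `MaxPool2d`: yolov3-tiny contains a max-pool layer with size 2 and stride 1, which darknet pads so that a 13×13 map stays 13×13. Without that padding the map shrinks to 12×12, the upsample doubles it to 24, and the concatenation with the 26×26 route fails. Whether this is the cause here is an assumption (it presumes a 416 input); the sketch only reproduces the sizes, and `pool_output_size` is an illustrative helper.

```python
def pool_output_size(size_in, kernel, stride, total_padding=0):
    """Spatial size after a pooling layer (floor division, as in PyTorch)."""
    return (size_in + total_padding - kernel) // stride + 1


feature = 13  # 416 input divided by the total stride of 32

# Plain pooling with kernel 2 and stride 1, no padding.
unpadded = pool_output_size(feature, kernel=2, stride=1)
# Darknet-style behaviour: one extra row and column of padding keeps the size.
padded = pool_output_size(feature, kernel=2, stride=1, total_padding=1)

print(unpadded * 2, padded * 2)  # sizes after the 2x upsample
```

In PyTorch ports this is commonly handled by placing an `nn.ZeroPad2d((0, 1, 0, 1))` before that pooling layer.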

About multi-GPU use

Has anyone used several GPUs to run sparsity_train.py? I hope to get your help, thanks.

A question about donntprune

donntprune = []
for k, m in enumerate(model.modules()):
    if isinstance(m, shortcutLayer):
        x = k + m.froms - 8
        donntprune.append(x)
        x = k - 3
        donntprune.append(x)

In this code, for x = k + m.froms - 8, m.froms should be equal to -3, so x = k + m.froms - 8 is equivalent to x = k - 11. Is that right? In other words, it is the 11th layer before the current one? I don't quite understand; please advise.
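The offsets make more sense once you recall that `model.modules()` walks the module tree depth-first, so each cfg block contributes several entries. The sketch below models that enumeration in plain Python under two assumptions that the source does not confirm: every convolutional block is a `Sequential(conv, bn, leaky)` and every shortcut block is a `Sequential(shortcutLayer)`. Under those assumptions the BN indices come out as 4, 8, 12, 16, 22, matching the `layer index` values in the pruning log earlier on this page.

```python
def flatten(blocks):
    """Mimic enumerate(model.modules()) for a Darknet-like model.

    Index 0 is the model itself, index 1 its ModuleList, then for each
    block one container entry followed by its sub-layers.
    """
    names = ["Darknet", "ModuleList"]
    for block in blocks:
        names.append("Sequential")
        names.extend(["conv", "bn", "leaky"] if block == "conv" else ["shortcut"])
    return names


# Start of yolov3.cfg: four convolutional blocks, a shortcut (from=-3), one more conv.
modules = flatten(["conv", "conv", "conv", "conv", "shortcut", "conv"])
bn_indices = [i for i, name in enumerate(modules) if name == "bn"]
k = modules.index("shortcut")
froms = -3

print(bn_indices)                # positions of the BN layers
print(k, k - 3, k + froms - 8)   # shortcut index and the two excluded indices
```

In this model `k - 3` is the BN of the convolution just before the shortcut, and `k + m.froms - 8` (that is, `k - 11`) is the BN of the block the shortcut adds from. Both inputs of the addition must keep the same channel count, which would explain why they are excluded from pruning. The offsets count flattened module entries, not cfg layers.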

RuntimeError: shape is invalid for input of size ...

Hello,
I followed the steps provided in the readme, but the resulting weights and config file throw an error when I try to use them in a yolov3 implementation.

Traceback (most recent call last):
  File "detect.py", line 48, in <module>
    model.load_darknet_weights(opt.weights_path)
  File "/home/_/Desktop/PyTorch-YOLOv3/models.py", line 315, in load_darknet_weights
    conv_w = torch.from_numpy(weights[ptr : ptr + num_w]).view_as(conv_layer.weight)
RuntimeError: shape '[125, 106, 3, 3]' is invalid for input of size 14720

Does anyone have an idea why? I am using a trained model with 25 classes and yolov3-spp; I followed the issue on yolov3-tiny to make it work. For the training I used batch=5, subdivisions=10, epoch=1, size=608 and a dataset with 300,000 pictures. Thanks.

coco.data

What does the COCO.data file contain? Could you give me an example? Thanks.
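For orientation, a darknet-style `.data` file is a short list of `key = value` lines such as `classes`, `train`, `valid`, `names` and `backup`. The example below assumes this repository reads the same layout, which the source does not confirm; all paths are placeholders. The hypothetical `parse_data_file` helper shows how such a file can be read.

```python
# Placeholder contents in the darknet .data layout (paths are examples only).
EXAMPLE_DATA_FILE = """\
classes = 80
train = data/coco/trainvalno5k.txt
valid = data/coco/5k.txt
names = data/coco.names
backup = backup/
"""


def parse_data_file(text):
    """Parse `key = value` lines into a dict, skipping blanks and comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


options = parse_data_file(EXAMPLE_DATA_FILE)
print(options["classes"], options["train"])
```

The `train` and `valid` entries point to text files that list one image path per line.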

weights

hello, could you please open your slimmed weights?

mAP

Has anyone measured the mAP after pruning, or do you only look at precision and recall?

Sparsity training is very slow

Hello. During sparsity training I found that training is very slow, much slower than training directly in darknet. Is this normal?

Training problem

Does this code train normally? Why do I get an error?
Traceback (most recent call last):
File "/media/data/liuben/optimize/yolov3-network-slimming/sparsity_train.py", line 164, in
train()
File "/media/data/liuben/optimize/yolov3-network-slimming/sparsity_train.py", line 110, in train
loss = model(imgs, targets)
File "/home/teamway/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/media/data/liuben/optimize/yolov3-network-slimming/yolomodel.py", line 353, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/home/teamway/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/media/data/liuben/optimize/yolov3-network-slimming/yolomodel.py", line 107, in forward
precision = float(nCorrect / nProposals)
ZeroDivisionError: division by zero

Sparsity training: tensor sizes do not match

File "sparsity_train.py", line 159, in
load network
done!
load weightsfile
done!
train()
File "sparsity_train.py", line 105, in train
loss = model(imgs, targets)
File "/home/soc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/soc/PROJECT/zyp/Slim/1-2-yolov3-network-slimming-master/yolomodel.py", line 352, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/home/soc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/soc/PROJECT/zyp/Slim/1-2-yolov3-network-slimming-master/yolomodel.py", line 130, in forward
loss_y = self.mse_loss(y[mask], ty[mask])
File "/home/soc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/soc/.local/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 431, in forward
return F.mse_loss(input, target, reduction=self.reduction)
File "/home/soc/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 2203, in mse_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/home/soc/.local/lib/python3.6/site-packages/torch/functional.py", line 52, in broadcast_tensors
return torch._C._VariableFunctions.broadcast_tensors(tensors)
RuntimeError: The size of tensor a (5) must match the size of tensor b (3) at non-singleton dimension 0

training error!

Hi:
During training I get an error at this line:

loss_cls = (1 / nB) * self.ce_loss(pred_cls[mask], torch.argmax(tcls[mask], dim=1))

Traceback (most recent call last):
File "/home/lc/work/yolov3-network-slimming/sparsity_train.py", line 159, in
train()
File "/home/lc/work/yolov3-network-slimming/sparsity_train.py", line 107, in train
loss = model(imgs, targets)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/lc/work/yolov3-network-slimming/yolomodel.py", line 365, in forward
x, *losses = self.module_list[i][0](x, targets)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/lc/work/yolov3-network-slimming/yolomodel.py", line 147, in forward
print(torch.argmax(tcls[mask], dim=1))
File "/usr/local/lib/python3.6/dist-packages/torch/functional.py", line 374, in argmax
return torch._argmax(input, dim, keepdim)
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Abnormal metrics during sparsity training

The training log is as follows:

[Epoch 0/20, Batch 13/2302] [Losses: x 0.204015, y 0.214396, w 4.140403, h 1.606703, conf 30.286903, cls 0.171045, total 36.623466, recall: 0.00000, precision: 0.00000]
[Epoch 0/20, Batch 23/2302] [Losses: x 0.216009, y 0.182937, w 2.846014, h 1.343955, conf 28.463878, cls 0.171810, total 33.224602, recall: 0.02047, precision: 0.07926]
[Epoch 0/20, Batch 39/2302] [Losses: x 0.239066, y 0.192233, w 2.735792, h 2.118299, conf 28.024119, cls 0.179756, total 33.489265, recall: 0.00544, precision: 0.05238]
[Epoch 0/20, Batch 73/2302] [Losses: x 0.209569, y 0.173580, w 0.820454, h 0.410769, conf 20.885465, cls 0.170385, total 22.670223, recall: 0.02878, precision: 0.12886]
[Epoch 0/20, Batch 83/2302] [Losses: x 0.253361, y 0.169483, w 1.047872, h 0.685030, conf 19.446521, cls 0.170343, total 21.772610, recall: 0.04167, precision: 0.17255]
[Epoch 0/20, Batch 89/2302] [Losses: x 0.246635, y 0.184126, w 1.413699, h 1.299393, conf 15.728958, cls 0.174179, total 19.046989, recall: 0.03855, precision: 0.15952]
[Epoch 0/20, Batch 102/2302] [Losses: x 0.257247, y 0.206709, w 0.842265, h 0.752688, conf 12.285303, cls 0.169251, total 14.513463, recall: 0.04132, precision: 0.06991]
......
[Epoch 19/20, Batch 2261/2302] [Losses: x 0.158596, y 0.126152, w 0.243440, h 0.186063, conf 0.602620, cls 0.164563, total 1.481433, recall: 0.53978, precision: 0.06947]
[Epoch 19/20, Batch 2271/2302] [Losses: x 0.169528, y 0.110445, w 0.261716, h 0.266443, conf 0.320310, cls 0.159760, total 1.288202, recall: 0.50892, precision: 0.09175]
[Epoch 19/20, Batch 2276/2302] [Losses: x 0.131687, y 0.120498, w 0.229741, h 0.150297, conf 0.556042, cls 0.165255, total 1.353521, recall: 0.56410, precision: 0.06603]
[Epoch 19/20, Batch 2285/2302] [Losses: x 0.151971, y 0.106440, w 0.196245, h 0.137052, conf 0.336564, cls 0.161891, total 1.090163, recall: 0.57516, precision: 0.04400]
0~20%:0.742166,20~40%:0.979765,40~60%:0.985903,60~80%:0.990159,80~100%:2.850679

This looks abnormal and is similar to #2 (comment). Has anyone run into this, and is there a fix?

yolov3-VOC

Can I run sparsity training on the yolov3-VOC network? The following error occurs when using my own data:
ValueError: setting an array element with a sequence.

I hit this error when fine-tuning on my own dataset

Traceback (most recent call last):
File "sparsity_train.py", line 161, in
train()
File "sparsity_train.py", line 66, in train
model.load_weights(args.weightsfile)
File "/home/lizhaokun/code/yolo/yolov3-network-slimming-master/yolomodel.py", line 427, in load_weights
conv_weights = conv_weights.view_as(conv.weight.data)
RuntimeError: shape '[1024, 512, 3, 3]' is invalid for input of size 2024224
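I have 2 classes, and I changed both the number of classes in the cfg and the filter count of the layer just above each yolo layer. Does anyone know why this happens?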
