tanluren / yolov3-channel-and-layer-pruning

YOLOv3 / YOLOv4 channel and layer pruning, knowledge distillation

License: Apache License 2.0

Shell 4.45% Python 94.30% Dockerfile 1.25%

yolov3-channel-and-layer-pruning's People

weycui zyg11 lukaisheng1203 piseyyou ai-jit crazyvertigo rotorliu hajungong007 liuwenhaha jianyushu gaojie0105 brownsweater davidsonggithub anhuipl2010 yuxulingche weitaoatvison tigermachinelearning j201111100523 seeker1943 juzigithub gc5218112 lilin19890401 buendilong jinliemma tuq820 xiaojinu lanyastar matrixlover superqirui dennistang742 aohan12138 dreadlord1984 goodgoodstudy92 yangyin2016 871864580 zxj796314 kmoond ii0 hsdxm yinshunyao kkaory cclauss sulince sssheng123 pandinosaurus cscn89 panda781022 collector-m benjamesbabala hongbowei gracekafuu whklwhkl darkknightzh 666dzy666 pawopawo gm19900510 yangtao-ai wen0618 tchigher liuhansen zhangwei730 gztangde hyqyoung shihangy r0use gchinanty leo-xxx git-manager daibin88 dodogoffy zrh0712 chenlei1976 baidu88vip mozpp jiangbingqing wulele2 guoxingyan wang-xinyu sailychen g921002 sakusss cxczzy wangkangchn zjysnow tony-tf baileyqbb wbercode qinhaihong-red xiefeiwhu liguang190223 zorrocai milort chase2816 qiuhui1991 hxl1990 shiquan0304 dyqgithub wendinghe ailihong felixzhang7

yolov3-channel-and-layer-pruning's Issues

yolov3 8-bit quantization experiments

@tanluren,
Hello!

Thanks for sharing! Have you run any 8-bit quantization experiments on yolov3? Thank you.

Best regards!

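Can your method be used with EfficientDet? (single-object detection)

I am doing ordinary single-object detection, and it feels like it does not need that many parameters.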

ImportError: No module named torch.distributed

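How can this problem be solved?

I trained with the same dataset and parameters as you. The channel prune results are similar, but after layer prune the mAP before fine-tuning is only 20. What could be the problem?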

Can you help me solve this? WARNING: non-finite loss, ending training tensor([ nan, 0.22070, 0.00000, nan], device='cuda:0')

WARNING: non-finite loss, ending training tensor([ nan, 0.22070, 0.00000, nan], device='cuda:0')

mAP = 0 after pruning? Inference time unchanged? What is the correct workflow?

I pruned with the following commands
`python train.py --cfg cfg/yolov3.cfg --data data
--weights weights/last.pt
--epochs 100 --batch-size 16 -sr --s 0.001 --prune 1

python shortcut_prune.py --cfg cfg/yolov3.cfg --data data
--weights weights/last.pt
--percent 0.6`

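Sparse training stopped before reaching 100 epochs.
After that the results were very poor.

Do you have any suggestions?
Is more sparse training needed? Is fine-tuning required afterwards? Can the mAP be normal after pruning without any further training?
Inference time is unchanged? Is that because I am using a GPU, which was already fast, so the speed-up from pruning is not noticeable? (Single 1080 Ti.)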

Channel pruning error: get_mask2 not found

 Epoch   gpu_mem      GIoU       obj       cls     total      soft   targets  img_size

0%| | 0/309 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 534, in <module>
train() # train normally
File "train.py", line 313, in train
idx2mask = get_mask2(model, prune_idx, 0.95)
NameError: name 'get_mask2' is not defined
0%| | 0/309 [00:00<?, ?it/s]
Is there a naming problem?
I changed it to get_mask and it still raises an error.

local variable 'labels' referenced before assignment

Hello, why do I get this error when I run it?
img, labels = load_mosaic(self, index)
File "O:\Deep_Learning\Dataset\YOLOv3\yolov3-channel-and-layer-pruning-master\utils\datasets.py", line 607, in load_mosaic
labels4.append(labels)
UnboundLocalError: local variable 'labels' referenced before assignment

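Special handling of activation values

In prune_utils.py, in the handling of activation values, there is the line next_bn.running_mean.data.sub_(offset). Why is sub used here? I do not quite understand it. Isn't running_mean supposed to be the mean computed from the data statistics?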
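One possible reading of the `sub_` call, offered as a sketch and not as the author's explanation: if the pruned input channels contributed a roughly constant amount `offset` to the next convolution's output, then after pruning the BN input is smaller by `offset`. Shifting `running_mean` down by the same amount leaves the normalized output unchanged. The function `bn_eval` and all numbers below are made up for illustration.

```python
import math

def bn_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode batch norm applied to a single scalar activation."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

# Hypothetical values for one output channel of the next conv layer.
y_full = 1.7     # conv output while all input channels are present
offset = 0.4     # assumed constant contribution of the pruned input channels
mean, var, gamma, beta = 0.9, 2.0, 1.3, -0.2

y_pruned = y_full - offset   # conv output once those channels are gone
before = bn_eval(y_full, mean, var, gamma, beta)
# Equivalent of running_mean.sub_(offset): shift the stored mean by the same amount.
after = bn_eval(y_pruned, mean - offset, var, gamma, beta)

print(abs(before - after) < 1e-9)
```

Under this reading, `running_mean` stops being a pure data statistic after pruning; it is reused as the place where the removed constant is absorbed.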

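There seems to be a small bug: the printed learning rate lr is not up to date

print('learning rate:',optimizer.param_groups[0]['lr']) runs before the learning-rate adjustment
lr = adjust_learning_rate(optimizer, 0.1, epoch, ni, nb), so the printed lr is not the current lr but the learning rate from the previous epoch.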
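The ordering issue described above can be sketched without PyTorch. `FakeOptimizer` and this `adjust_learning_rate` are hypothetical stand-ins (the repository's real function takes different arguments and may use a different schedule); the only point is that the value is read after the adjustment.

```python
class FakeOptimizer:
    """Stand-in for a torch optimizer: only exposes param_groups."""
    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]

def adjust_learning_rate(optimizer, gamma, epoch, milestones=(2, 4), base_lr=0.01):
    """Hypothetical step schedule; not the repository's implementation."""
    lr = base_lr * gamma ** sum(epoch >= m for m in milestones)
    for group in optimizer.param_groups:
        group["lr"] = lr
    return lr

optimizer = FakeOptimizer(lr=0.01)
logged = []
for epoch in range(5):
    lr = adjust_learning_rate(optimizer, 0.1, epoch)   # adjust first ...
    logged.append(optimizer.param_groups[0]["lr"])     # ... then read the value to log
    assert logged[-1] == lr                            # logged value is the current one

print(logged)
```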

bn_weights/hist

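How far do the bn_weights need to be trained before sparsification counts as complete? In the example figure, the values are below 0.2 at 300+ epochs?

A question about sparse training on the VOC dataset

Sparse training on the VOC dataset gives a very low mAP. Starting from yolov3.weights, after 100 steps the mAP is only in the teens and the loss keeps oscillating. Is sparse training only suitable for single-object datasets, for example the hand dataset?

Without sparse training a high mAP can be reached. Any advice would be appreciated.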

Which anchor matches the ground-truth box?

Hello author, I used your code for object detection but ran into a problem: two boxes appear around a head, and NMS cannot remove them.
I want to detect person, hand and head. I think a ground-truth box should be matched to the anchor that has the largest IoU with it across the different YOLO layers. The code seems to match the ground truth against the anchors in every YOLO layer, and this causes problems in the detection results. Hoping for an answer, thanks.

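Loading the pruned cfg and weights files with the official YOLO (C++): why is nothing detected?

A question: when I load the pruned cfg and weights files with the official YOLO (C++), why is nothing detected?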

RuntimeError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 10.91 GiB total capacity; 10.03 GiB already allocated; 18.12 MiB free; 10.17 GiB reserved in total by PyTorch)

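The GPU has enough memory, but this error always appears.

Is there a problem with the source code implementation?

Installing torch requires 64-bit Python, but I get OSError: [WinError 193] %1 is not a valid Win32 application.
After switching to a 32-bit Python, torch raises a different error: torch._C cannot be found.

Why is the weights file unchanged after sparse training?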

no question

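Why do the official weights have to be trained on new data before pruning?

Some questions about the BN pruning rule in shortcut_prune

Many thanks for contributing the code and giving us the chance to learn.
Sorry, I am back with more questions. Thank you very much for your answers.
I have two questions.
1. In the code, at L307, the conv layer just before a shortcut uses the mask of the conv layer referenced by from. But that mask no longer reflects the importance ordering of the channels in the conv layer just before the shortcut.

I came up with a rule, and I am not sure whether it is feasible:
take the number of channels pruned in the mask of the from conv layer and build a second mask, mask2, such that sum(mask2)=sum(mask); then sort the channels of the conv layer just before the shortcut and keep the largest ones.
The problem this brings is that the channels no longer match once the two masks are added together, so the rules for building the network may have to be changed afterwards.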
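The proposed rule can be sketched in a few lines. This illustrates the poster's idea only, not the repository's implementation; `topk_mask` and the BN scale values are invented for the example.

```python
def topk_mask(bn_weights, keep):
    """Return a 0/1 mask that keeps the `keep` channels with the largest |gamma|."""
    order = sorted(range(len(bn_weights)), key=lambda i: abs(bn_weights[i]), reverse=True)
    kept = set(order[:keep])
    return [1 if i in kept else 0 for i in range(len(bn_weights))]

# Hypothetical BN scale factors for the two layers joined by a shortcut.
from_gammas = [0.9, 0.05, 0.7, 0.01]   # conv layer referenced by `from`
prev_gammas = [0.02, 0.8, 0.03, 0.6]   # conv layer just before the shortcut

mask = topk_mask(from_gammas, keep=2)
mask2 = topk_mask(prev_gammas, keep=sum(mask))   # same count, own ordering

print(mask, mask2)
# Positions where both layers kept the same channel index.
print([a == 1 and b == 1 for a, b in zip(mask, mask2)])
```

With these numbers the two masks keep the same number of channels but at different indices, which is the alignment problem raised in the question: an element-wise add assumes channel i of one tensor corresponds to channel i of the other.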

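Second question
The BN pruning rule we use is the same as in the paper "Learning Efficient Convolutional Networks Through Network Slimming", but I wonder whether you have thought about the following.
The pruning threshold is obtained by sorting the weights of all BN layers together. Is that really fair?
(1) Values at the front of the network are closer to image pixel values, while the last layers are closer to class probabilities. The BN weights do not necessarily follow the same distribution.
(2) There are shortcuts in the middle of the network; after two convolution outputs are added, the values become larger. This may affect the BN weights.
I do not know whether (1) and (2) actually occur.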
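The concern about a single global threshold can be made concrete with a toy example. The layer names and gamma values are invented, and `global_threshold` is a simplified sketch of the "sort all BN weights together" idea, not code from the repository.

```python
def global_threshold(layers, prune_ratio):
    """Threshold taken from all BN |gamma| values sorted together."""
    all_gammas = sorted(abs(g) for gammas in layers.values() for g in gammas)
    cut = min(int(len(all_gammas) * prune_ratio), len(all_gammas) - 1)
    return all_gammas[cut]

def kept_per_layer(layers, threshold):
    """Count how many channels in each layer survive the threshold."""
    return {name: sum(abs(g) >= threshold for g in gammas)
            for name, gammas in layers.items()}

# Two hypothetical layers whose BN weights live on different scales.
layers = {
    "early": [0.05, 0.08, 0.10, 0.12],
    "late": [0.60, 0.75, 0.90, 1.10],
}
thr = global_threshold(layers, prune_ratio=0.5)
print(thr, kept_per_layer(layers, thr))
```

If the per-layer distributions really differ this much, a global cut removes every channel of the small-scale layer and none of the other. Whether trained YOLOv3 layers differ that strongly is exactly what the question asks and is not answered here.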

issues35
Looking forward to your reply. Thank you very much.

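Is this pruning only effective for single-object detection?

I trained the official yolo on my own dataset (cigarettes and mobile phones) and got an mAP of a bit over 70.
Then I ran sparse training, and the mAP dropped to 0 in the very first epoch?
Is it that the method only works well for single-object detection?

Training fails with the prebias option enabled

Step: sparse training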

ssh://[email protected]:8026/usr/local/bin/python -u /project/pytorch-yolov3/train_prune.py --data=data/person_1cls.data --batch-size=4 --cfg=cfg/yolov3-spp-1cls-a2.cfg --weights=weights/yolov3-spp.weights --device=0 --prebias -sr --s=0.001 --prune=0
Namespace(accumulate=2, adam=False, arc='defaultpw', batch_size=4, bucket='', cache_images=False, cfg='cfg/yolov3-spp-1cls-a2.cfg', data='data/person_1cls.data', device='0', epochs=273, evolve=False, img_size=416, img_weights=False, multi_scale=False, name='', nosave=False, notest=False, prebias=True, prune=0, rect=False, resume=False, s=0.001, sr=True, t_cfg='', t_weights='', transfer=False, var=None, weights='weights/yolov3-spp.weights')
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1660', total_memory=5941MB)

loaded weights from weights/yolov3-spp.weights 

normal sparse training 
Model Summary: 225 layers, 6.25733e+07 parameters, 54 gradients
Starting prebias for 1 epochs...

     Epoch   gpu_mem      GIoU       obj       cls     total      soft    rratio   targets  img_size
  0%|                                                  | 0/1524 [00:00<?, ?it/s]learning rate: 1e-06
Traceback (most recent call last):
  File "/project/pytorch-yolov3/train_prune.py", line 537, in <module>
    prebias()  # optional
  File "/project/pytorch-yolov3/train_prune.py", line 484, in prebias
    train()  # transfer-learn yolo biases for 1 epoch
  File "/project/pytorch-yolov3/train_prune.py", line 380, in train
    BNOptimizer.updateBN(sr_flag, model.module_list, opt.s, prune_idx, idx2mask)
  File "/project/pytorch-yolov3/utils/prune_utils.py", line 148, in updateBN
    bn_module.weight.grad.data.add_(s * torch.sign(bn_module.weight.data))  # L1
AttributeError: 'NoneType' object has no attribute 'data'
  0%|                                                  | 0/1524 [00:02<?, ?it/s]

Process finished with exit code 1

Training works after turning prebias off. Has anyone else seen this problem?
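The traceback shows that `bn_module.weight.grad` is `None` when `updateBN` runs. Why that happens with `--prebias` is not stated in the report; a phase in which the BN weights receive no gradient is assumed here as one possible cause. Under that assumption, a guard like the one below skips such weights. `Param` and `update_bn_l1` are hypothetical stand-ins written in plain Python, not the repository's classes.

```python
class Param:
    """Minimal stand-in for a parameter: values plus an optional gradient."""
    def __init__(self, data, grad=None):
        self.data = data
        self.grad = grad

def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def update_bn_l1(bn_weights, s):
    """Add the L1 subgradient s * sign(w) to each gradient; skip weights without one."""
    skipped = 0
    for w in bn_weights:
        if w.grad is None:   # assumed case: no gradient was computed for this weight
            skipped += 1
            continue
        w.grad = [g + s * sign(v) for g, v in zip(w.grad, w.data)]
    return skipped

trainable = Param([0.5, -0.2, 0.0], grad=[0.1, 0.1, 0.1])
no_grad = Param([0.3, 0.4], grad=None)
skipped = update_bn_l1([trainable, no_grad], s=0.001)
print(skipped, trainable.grad)
```

Skipping means no sparsity penalty is applied to those weights in that phase, which may or may not be what is wanted; running sparse training without prebias, as the poster did, avoids the question entirely.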

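python3 detect.py .cfg .weights cannot detect image

Whether I use yolov3.cfg with yolov3.weights or other cfg and weights files (provided by the author), no objects are detected.

Why is the mAP reported by prune.py different from the one computed along the way during train?

Accuracy in train

Accuracy reported by prune; P dropped a lot.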

+------------+----------+----------+
| Metric | Before | After |
+------------+----------+----------+
| mAP | 0.882270 | 0.881940 |
| Parameters | 61523734 | 7517869 |
| Inference | 0.0431 | 0.0219 |
+------------+----------+----------+

FileNotFoundError when pruning: [Errno 2] No such file or directory: '/prune_0.15_keep_0.01_3_shortcut_usr/prune_0.15_keep_0.01_3_shortcut_cx/prune_0.15_keep_0.01_3_shortcut_darknetalexeyAB/prune_0.15_keep_0.01_3_shortcut_darknet-master/prune_0.15_keep_0.01_3_shortcut_names_data/prune_0.15_keep_0.01_3_shortcut_yolo-obj.cfg

Pruning command:
python layer_channel_prune.py --cfg /usr/cx/darknetalexeyAB/darknet-master/names_data/yolo-obj.cfg --data /usr/cx/darknetalexeyAB/darknet-master/names_data/voc.data --weights weights/best.pt --global_percent 0.85 --layer_keep 0.01 --shortcuts 16

FileNotFoundError: [Errno 2] No such file or directory: '/prune_0.15_keep_0.01_3_shortcut_usr/prune_0.15_keep_0.01_3_shortcut_cx/prune_0.15_keep_0.01_3_shortcut_darknetalexeyAB/prune_0.15_keep_0.01_3_shortcut_darknet-master/prune_0.15_keep_0.01_3_shortcut_names_data/prune_0.15_keep_0.01_3_shortcut_yolo-obj.cfg
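In the failing path the output prefix appears after every `/`, which is what a plain string replace on `/` would produce for an absolute cfg path. That the script builds the output name this way is an inference from the error message, not something shown in the report. A sketch of the difference, with `prefixed_cfg_path` as a hypothetical helper:

```python
import os

def prefixed_cfg_path(cfg_path, prefix):
    """Prepend `prefix` to the file name only, leaving the directories untouched."""
    directory, filename = os.path.split(cfg_path)
    return os.path.join(directory, prefix + filename)

cfg = "/usr/cx/darknetalexeyAB/darknet-master/names_data/yolo-obj.cfg"
prefix = "prune_0.15_keep_0.01_3_shortcut_"

broken = cfg.replace("/", "/" + prefix)   # prefix lands in every path component
fixed = prefixed_cfg_path(cfg, prefix)    # prefix lands on the file name only

print(broken)
print(fixed)
```

A relative path with a single separator, such as cfg/yolov3.cfg, would not show the problem under this assumption, so copying the cfg into the project's cfg directory is a possible workaround.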

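Out of GPU memory when training on a 2080 Ti

On a 2080 Ti with a network input size of 416, even a batch size of 8 runs out of GPU memory. Is this normal?

After training on my own dataset from the official weights and then pruning, can the model still be converted to pb normally?

A question: after training on my own dataset from the official weights and then pruning, can the model still be converted to pb normally? Is there an existing technique you can recommend for pruning pb weight files directly?

yolov3-spp pruning

I want to prune yolov3-spp (that is, prune a different network). Which parameters need to be changed?
yolov3-spp only has a few more maxpooling layers than yolov3.

So far I have modified ignore_idx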

def parse_module_defs(model):
    if hasattr(model, 'module'):
        print('muti-gpus sparse')
        module_defs = model.module.module_defs
    else:
        print('single-gpu sparse')
        module_defs = model.module_defs
    CBL_idx = []
    Conv_idx = []
    for i, module_def in enumerate(module_defs):
        if module_def['type'] == 'convolutional':
            if module_def['batch_normalize'] == '1':
                CBL_idx.append(i)
            else:
                Conv_idx.append(i)
    ignore_idx = set()
    # Do not prune the two layers that feed each shortcut
    for i, module_def in enumerate(module_defs):
        if module_def['type'] == 'shortcut':
            ignore_idx.add(i-1)
            identity_idx = (i + int(module_def['from']))
            if module_defs[identity_idx]['type'] == 'convolutional':
                ignore_idx.add(identity_idx)
            elif module_defs[identity_idx]['type'] == 'shortcut':
                ignore_idx.add(identity_idx - 1)
    # Do not prune the conv layers before the upsample layers; for yolov3, CBL_idx[-1]==104, which can be used to tell it apart from yolov3-spp
    if CBL_idx[-1]==104:
        print('sparse: yolov3')
        ignore_idx.add(84)
        ignore_idx.add(96)
    else:
        print('sparse: yolov3-spp')
        # Do not prune the layers directly before and after spp (77, 84)
        ignore_idx.add(77)
        ignore_idx.add(84)
        ignore_idx.add(91)
        ignore_idx.add(103)

    prune_idx = [idx for idx in CBL_idx if idx not in ignore_idx]

    return CBL_idx, Conv_idx, prune_idx

compact_conv.weight.data = tmp[out_channel_idx, :, :, :].clone() raises an error

Hello, I got the following error when running layer_channel_prune.py:
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda ->auto::operator()(int)->auto: block: [32,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Does anyone know the cause?

ModuleNotFoundError: No module named 'tensorboard'

During sparse training I got No module named 'tensorboard'. It still fails after pip3 install tensorboard, and running pip3 install future afterwards gives the same error.

Does this have to be fine-tuned on a specific dataset?

As the title says: what should I do if I want to build a general-purpose detector?

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one.

I ran into the following error during sparsity training. What causes it? Thanks.
Error LOG:
Reading labels (14363 found, 0 missing, 9 empty for 14372 images): 100%|████████████████████████████████████████████| 14372/14372 [02:20<00:00, 102.01it/s]
Model Summary: 225 layers, 6.42767e+07 parameters, 6.42767e+07 gradients
Starting training for 120 epochs...

 Epoch   gpu_mem      GIoU       obj       cls     total      soft    rratio   targets  img_size

0%| | 0/450 [00:00<?, ?it/s]learning rate: 1e-06
0/119 6.96G 1.54 1.88 0.968 4.38 0 0 101 416: 100%|██████████████| 450/450 [06:35<00:00, 1.14it/s]
Class Images Targets P R mAP F1: 100%|██████████████████████████████████| 113/113 [02:52<00:00, 1.53s/it]
all 3.59e+03 1.01e+05 0.322 0.453 0.365 0.363

 Epoch   gpu_mem      GIoU       obj       cls     total      soft    rratio   targets  img_size

0%| | 0/450 [00:00<?, ?it/s]learning rate: 0.0011625
Traceback (most recent call last):
File "train.py", line 542, in <module>
train() # train normally
File "train.py", line 348, in train
pred = model(imgs)
File "/devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 459, in forward
self.reducer.prepare_for_backward([])
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). (prepare_for_backward at /opt/conda/conda-bld/pytorch_1573049387353/work/torch/csrc/distributed/c10d/reducer.cpp:518)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f0d6836f687 in /devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::Reducer::prepare_for_backward(std::vector<torch::autograd::Variable, std::allocator<torch::autograd::Variable> > const&) + 0x7b7 (0x7f0d6de68667 in /devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: + 0x7cfca1 (0x7f0d6de56ca1 in /devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: + 0x2065e6 (0x7f0d6d88d5e6 in /devdata/liqm/Tools/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: _PyMethodDef_RawFastCallKeywords + 0x254 (0x55be4fe14744 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #5: _PyCFunction_FastCallKeywords + 0x21 (0x55be4fe14861 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #6: _PyEval_EvalFrameDefault + 0x52f8 (0x55be4fe806e8 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #7: _PyEval_EvalCodeWithName + 0x2f9 (0x55be4fdc4539 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #8: _PyFunction_FastCallDict + 0x1d5 (0x55be4fdc5635 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #9: _PyObject_Call_Prepend + 0x63 (0x55be4fde3e53 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #10: PyObject_Call + 0x6e (0x55be4fdd6dbe in /devdata/liqm/Tools/miniconda3/bin/python)
frame #11: _PyEval_EvalFrameDefault + 0x1e42 (0x55be4fe7d232 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #12: _PyEval_EvalCodeWithName + 0x2f9 (0x55be4fdc4539 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #13: _PyFunction_FastCallDict + 0x1d5 (0x55be4fdc5635 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #14: _PyObject_Call_Prepend + 0x63 (0x55be4fde3e53 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #15: + 0x16ba3a (0x55be4fe1ba3a in /devdata/liqm/Tools/miniconda3/bin/python)
frame #16: _PyObject_FastCallKeywords + 0x49b (0x55be4fe1c8fb in /devdata/liqm/Tools/miniconda3/bin/python)
frame #17: _PyEval_EvalFrameDefault + 0x4a96 (0x55be4fe7fe86 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #18: _PyEval_EvalCodeWithName + 0xac9 (0x55be4fdc4d09 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #19: _PyFunction_FastCallKeywords + 0x387 (0x55be4fe13f57 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #20: _PyEval_EvalFrameDefault + 0x416 (0x55be4fe7b806 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #21: _PyEval_EvalCodeWithName + 0x2f9 (0x55be4fdc4539 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #22: PyEval_EvalCodeEx + 0x44 (0x55be4fdc5424 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #23: PyEval_EvalCode + 0x1c (0x55be4fdc544c in /devdata/liqm/Tools/miniconda3/bin/python)
frame #24: + 0x22ab74 (0x55be4fedab74 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #25: PyRun_FileExFlags + 0xa1 (0x55be4fee4eb1 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #26: PyRun_SimpleFileExFlags + 0x1c3 (0x55be4fee50a3 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #27: + 0x236195 (0x55be4fee6195 in /devdata/liqm/Tools/miniconda3/bin/python)
frame #28: _Py_UnixMain + 0x3c (0x55be4fee62bc in /devdata/liqm/Tools/miniconda3/bin/python)
frame #29: __libc_start_main + 0xf0 (0x7f0d71697830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #30: + 0x1db062 (0x55be4fe8b062 in /devdata/liqm/Tools/miniconda3/bin/python)

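Program cannot exit after training finishes

Hello author, I made a very small dataset to try to get the program running end to end, and ran the following command:
python train.py --cfg cfg/yolov3-nameplate.cfg --data data/nameplate.data --weights weights/darknet53.conv.74 --epochs 2 --batch-size 4
Training ran normally (checkpoints and TensorBoard were both fine), but the program cannot exit at the end. I do not know what the problem is.
I ran it on Windows with CPU only, PyTorch 1.3, 16 training images and 4 test images.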

python -c "from models import *; convert('cfg/yolov3.cfg', 'weights/last.pt')"

Hello, in python -c "from models import *; convert('cfg/yolov3.cfg', 'weights/last.pt')", should the first argument for the model conversion be cfg/my_cfg.cfg?

Training aborts abnormally and a "Python has stopped working" window pops up

Reading labels (8952 found, 0 missing, 0 empty for 8967 images): 100%|███████████▉| 8952/8967 [00:37<00:00,
Reading labels (8967 found, 0 missing, 0 empty for 8967 images): 100%|████████████| 8967/8967 [00:37<00:00, 237.71it/s]
Model Summary: 222 layers, 6.1626e+07 parameters, 6.1626e+07 gradients
Starting training for 273 epochs...

 Epoch   gpu_mem      GIoU       obj       cls     total      soft   targets  img_size

0%| | 0/561 [00:00<?, ?it/s]learning rate: 1e-06

(Pytorch) F:\DeepLearning\YOLO优化\yolov3-channel-and-layer-pruning-master>

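Hello, the program crashed shortly after training started, and then the "Python has stopped working" window popped up. What is the cause, and how can I fix it?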

AttributeError: 'DistributedDataParallel' object has no attribute 'module_list'

for idx in prune_idx:
    bn_weights = gather_bn_weights(model.module_list, [idx])
    tb_writer.add_histogram('before_train_perlayer_bn_weights/hist', bn_weights.numpy(), idx, bins='doane')

This newly added part of the author's code raises the error above.
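The parse_module_defs function quoted elsewhere on this page checks `hasattr(model, 'module')` before reading `module_defs`. The same check, applied before reading `module_list`, is one way to handle a wrapped model. The sketch below uses stand-in classes and a hypothetical `unwrap` helper; it does not import PyTorch.

```python
def unwrap(model):
    """Return the underlying model if `model` is a wrapper exposing `.module`."""
    return model.module if hasattr(model, "module") else model

class Darknet:
    """Stand-in for the real network class."""
    module_list = ["conv0", "bn0"]

class ParallelWrapper:
    """Stand-in for DistributedDataParallel, which stores the model in `.module`."""
    def __init__(self, module):
        self.module = module

plain = Darknet()
wrapped = ParallelWrapper(plain)

print(unwrap(plain).module_list == unwrap(wrapped).module_list)
```

In the failing loop this would mean passing `unwrap(model).module_list` to `gather_bn_weights`, assuming the rest of the call works unchanged on the unwrapped model.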

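Hello, detect.py gives no detections when using the camera

With the "u" (ultralytics) version, the command python detect.py --cfg cfg/yolov3.cfg --weights weights/yolov3.weights --source 0 --conf-thres 0.1 opens the camera and detects normally.
I did not see a tutorial from you on how to test or detect, so I used the same command here, but nothing is detected, as in the screenshot below. The cfg has also been switched to test mode. What is the reason?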

layer-keep

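channel keep percent per layer 0.01: isn't this parameter too small?

Roughly how large is the parameter file of the pruned model, and how long does inference take?

Dimension error during training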

File "train.py", line 499, in <module>
train() # train normally
File "train.py", line 327, in train
pred = model(imgs)
File "/home/112831/anaconda3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
result = self.forward(*input, **kwargs)
File "/home/112831/code/yolov3/models.py", line 244, in forward
x = module(x, img_size)
File "/home/112831/anaconda3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
result = self.forward(*input, **kwargs)
File "/home/112831/code/yolov3/models.py", line 149, in forward
p = p.view(bs, self.na, self.nc + 5, self.ny, self.nx).permute(0, 1, 3, 4, 2).contiguous() # prediction
RuntimeError: shape '[5, 3, 12, 52, 52]' is invalid for input of size 243360
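The numbers in this error can be worked through directly. The sketch assumes the standard YOLOv3 layout, in which the conv layer before each YOLO layer outputs anchors * (classes + 5) channels; `yolo_filters` is a helper written for this example.

```python
def yolo_filters(num_classes, anchors_per_layer=3):
    """Channels the conv before a YOLO layer must output: anchors * (classes + 5)."""
    return anchors_per_layer * (num_classes + 5)

bs, na, per_anchor, ny, nx = 5, 3, 12, 52, 52   # from the shape in the error
actual_size = 243360                            # from the error message

expected_channels = na * per_anchor               # what view() needs per grid cell
actual_channels = actual_size // (bs * ny * nx)   # what the conv layer produced

classes_in_yolo_layer = per_anchor - 5            # nc + 5 = 12, so nc = 7
print(expected_channels, actual_channels, yolo_filters(classes_in_yolo_layer))
```

So the YOLO layer expects 36 channels (7 classes) while the preceding conv outputs 18, which matches `yolo_filters(1)`. The traceback alone does not say whether `classes` or `filters` in the cfg is the wrong one.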

yolov3-spp-pan-scale.cfg

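The project currently supports yolov3 and yolov3-spp. How can it be modified to work with yolov3-spp-pan-scale.cfg? Basic training and sparse training both work with yolov3-spp-pan-scale.cfg, but running layer_channel_prune.py for layer and channel pruning raises AttributeError: 'NoneType' object has no attribute 'reshape'. How should this part of the pruning code be changed?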

IndexError: index -1 is out of bounds for axis 0 with size 0

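I am training on my own dataset. The same configuration trains fine under darknet, but with this project an error is raised after the first epoch.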
Corrupt JPEG data: 2 extraneous bytes before marker 0xd9
0/99 8.39G 1.26 1.18 8.95 11.4 0 162 416: 100%|████████████████████████████████████████████████████████████████████████████| 1228/1228 [07:18<00:00, 2.80it/s]
Traceback (most recent call last):
File "train.py", line 527, in <module>
train() # train normally
File "train.py", line 404, in train
save_json=final_epoch and epoch > 0 and 'coco.data' in data)
File "/home/work/deep_learning/yolov3-channel-and-layer-pruning/test.py", line 50, in test
dataset = LoadImagesAndLabels(test_path, img_size, batch_size)
File "/home/work/deep_learning/yolov3-channel-and-layer-pruning/utils/datasets.py", line 270, in __init__
nb = bi[-1] + 1 # number of batches
IndexError: index -1 is out of bounds for axis 0 with size 0
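I stepped into the failing location and printed n and bi: they are 0 and [] respectively.
Has anyone run into this?

Two-class detection, accuracy is 0

Single-class detection works fine and gives good results. But on the same dataset, when I detect two classes, the accuracy is very low at the start of training (0.008) and drops to 0 after a few epochs. I tried lowering the learning rate and increasing the batch size, with the same outcome. Has anyone run into this?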

AssertionError: No labels found. Recommend correcting image and label paths.

loaded weights from weights/yolov3.weights
Reading labels: 100%|████████████████████████| 20/20 [00:00<00:00, 78251.94it/s]
Traceback (most recent call last):
File "train.py", line 497, in <module>
train() # train normally
File "train.py", line 239, in train
cache_images=False if opt.prebias else opt.cache_images)
File "/home/nvidia/yolov3-channel-and-layer-pruning-master/utils/datasets.py", line 373, in __init__
assert nf > 0, 'No labels found. Recommend correcting image and label paths.'
AssertionError: No labels found. Recommend correcting image and label paths.

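How should this path be set? My data is fine; I trained on it with darknet before. Now it sits under this project's path and raises this error, and placing it on the darknet side gives the same result.
What should I do next?

Detecting OCR characters with yolov3: very low accuracy

Hello author! I was able to reproduce your hand-dataset detection, but ran into an accuracy problem when training on my own dataset. I am training a pruned yolov3 model on a train-ticket OCR dataset that I built myself, and the data preprocessing follows the link you provided. Already in the first step, basic training, the accuracy is very low: the mAP is only around 0.135, and testing shows many missed detections. I tried increasing the batch size and reducing the learning rate lr0, with no improvement. Do you have any suggestions?

Is prune 1 by default during basic training?

Hello, basic training has to be done before sparse training, right? Also, prune defaults to 1 during basic training, so it does not need to be changed?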

UnboundLocalError: local variable 'labels' referenced before assignment

Modified yolov3: 4 shortcut layers
Basic training command: python3 train.py --cfg /usr/cx/darknetalexeyAB/darknet-master/names_data/yolo-obj.cfg --data /usr/cx/darknetalexeyAB/darknet-master/names_data/voc.data --weights /usr/cx/darknetalexeyAB/darknet-master/yolov3.weights --epochs 100 --batch-size 20

shape '[512, 256, 3, 3]' is invalid for input of size 357392

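Hello, I trained on my own data and went through basic training, sparse training and shortcut_prune pruning. TensorBoard and the pruning results all looked normal, but when I finally ran detect.py with the pruned weights it failed with shape '[512, 256, 3, 3]' is invalid for input of size 357392. Do the weights produced by shortcut_prune need any further processing?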

RuntimeError: Given groups=1, weight of size 32 0 3 3, expected input[1, 16, 208, 208] to have 0 channels, but got 16 channels instead

The error above appears when pruning tiny yolo. I have looked at some suggested fixes, but none of them worked.
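A weight of size 32 0 3 3 means the preceding layer was left with zero channels. One common safeguard, sketched here as an assumption and not as the repository's behavior, is a per-layer floor so that a threshold mask can never be empty. The parameter name `layer_keep` mirrors the `--layer_keep` flag seen in commands on this page, but how the repository applies that flag is not shown; `mask_with_floor` is hypothetical.

```python
import math

def mask_with_floor(gammas, threshold, layer_keep=0.01):
    """Threshold mask that keeps at least ceil(layer_keep * n) channels, never fewer than 1."""
    n = len(gammas)
    min_keep = max(1, math.ceil(n * layer_keep))
    mask = [1 if abs(g) >= threshold else 0 for g in gammas]
    if sum(mask) < min_keep:
        # Too few survivors: fall back to the min_keep largest |gamma| in this layer.
        order = sorted(range(n), key=lambda i: abs(gammas[i]), reverse=True)
        mask = [0] * n
        for i in order[:min_keep]:
            mask[i] = 1
    return mask

floored = mask_with_floor([0.05, 0.08, 0.10, 0.12], threshold=0.6)
print(floored)
```

With a small network such as tiny yolo, an aggressive global percentage is more likely to empty a whole layer, so lowering the prune percentage is another thing to try.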

When pruning, doesn't the BN layer's bias need to be zeroed with the mask?

Why is this commented out?

Why not use bn_module.bias.data.mul_(mask)?