
pytorch-network-slimming's Introduction

Pytorch Network Slimming

TL;DR: Try mobilenetv3 first before you prune a big model :)

This repository contains tools that make it easier to implement Learning Efficient Convolutional Networks Through Network Slimming on different backbones.

Features

  • Auto-generate the pruning schema (tracker code inspired by torch2trt)
  • Channel pruning
  • Save and load the pruned network graph and params (without the pruning schema)
  • Export the pruned model to ONNX
  • Round channels up to multiples of 8 (for train/infer speed) (TODO: handle channel round-up for shortcuts)
  • Layer pruning

Supported model arches:

  • ResNet
  • RepVGG
  • vgg_bn
  • mobilenet_v2
  • mobilenet_v3 (doc)

Supported layers:

  • Conv2d
  • depthwise/pointwise Conv2d
  • SqueezeExcitation (SE block)
  • Linear
  • BatchNorm2d
  • torch.cat

Quick start on cifar10

  1. Install pns as a python package: python3 setup.py develop

  2. Install requirements.txt; these dependencies are only needed to train the cifar10 demo. pns itself depends only on pytorch

  3. Generate the pruning schema for resnet18:

    python3 gen_schema.py --net resnet18 --save_path ./schema/resnet18.json
  4. Train on cifar10, then fine-tune after applying Network Slimming.

    python3 main.py \
    --save_dir output \
    --dataset cifar10 \
    --net resnet18 \
    --epochs 120 \
    --batch_size 64 \
    --learning_rate 0.01 \
    --sparsity_train \
    --s 0.0001 \
    --fine_tune \
    --prune_schema ./schema/resnet18.json \
    --fine_tune_epochs 120 \
    --fine_tune_learning_rate 0.001 \
    --prune_ratio 0.75
  5. After applying Network Slimming, the pruning result is saved in the checkpoint under the _slim_pruning_result key, and the pruned params are saved in a separate checkpoint.

Eval model without pruning result:

python3 main.py \
--dataset cifar10 \
--net resnet18 \
--ckpt ./output/last.ckpt

Eval model with pruning result:

python3 main.py \
--dataset cifar10 \
--net resnet18 \
--ckpt ./output/pruned_0.75/model_with_pruning_result.ckpt \
--ckpt_pruned ./output/pruned_0.75/last.ckpt

Export pruned model to ONNX

python3 main.py \
--net resnet18 \
--ckpt ./output/pruned_0.75/model_with_pruning_result.ckpt \
--ckpt_pruned ./output/pruned_0.75/last.ckpt \
--export_onnx_path ./output/pruned_0.75/last.onnx
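
Under the hood this kind of export boils down to a standard torch.onnx.export call; a minimal sketch, where the input shape, tensor names, and opset are assumptions rather than what main.py necessarily uses:

    import torch

    dummy = torch.randn(1, 3, 224, 224)  # assumed input shape
    torch.onnx.export(
        pruner.pruned_model,             # pruned model restored as shown later
        dummy,
        "./output/pruned_0.75/last.onnx",
        input_names=["input"],
        output_names=["output"],
        opset_version=11,
    )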

Eval ONNX model (the demo script only supports CPU)

python3 main.py \
--dataset cifar10 \
--net resnet18 \
--ckpt ./output/pruned_0.75/last.onnx \
--device cpu

Benchmark ONNX model performance

python3 benchmark.py \
/path/to/onnx_model1.onnx \
/path/to/onnx_model2.onnx
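
For a quick manual latency check outside benchmark.py, a minimal onnxruntime timing loop could look like the sketch below (the model path, input shape, and iteration counts are illustrative assumptions):

    import time
    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("/path/to/onnx_model1.onnx",
                                providers=["CPUExecutionProvider"])
    x = np.random.randn(1, 3, 224, 224).astype(np.float32)
    name = sess.get_inputs()[0].name

    for _ in range(10):                      # warmup runs
        sess.run(None, {name: x})
    start = time.perf_counter()
    for _ in range(100):                     # timed runs
        sess.run(None, {name: x})
    print("%.2f ms / inference" % ((time.perf_counter() - start) / 100 * 1000))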

Experiment Results on CIFAR10

  • batch_size: 64
  • epochs: 120
  • sparsity: 1e-4
  • learning rate: 0.01
  • fine tune epochs: 120
  • fine tune learning rate: 0.01
  • onnx latency: input shape=1x3x224x224, OMP_NUM_THREADS=4, cpu=Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
| Net | Prune Ratio | Test Acc | Test Acc Diff | Params | Params Reduce | ONNX File Size (MB) | ONNX Latency | ONNX Memory |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RepVGG-A0-woid | 0 | 93.6 | 0 | 7.8 M | | | | |
| RepVGG-A0-woid | 0.75 | 93.88 | +0.28 | 2.2 M | 71.78% | | | |
| RepVGG-A0-woid | 0.5 | 93.86 | +0.26 | 3.8 M | 52.14% | | | |
| resnet18 | 0 | 94.48 | 0 | 11.2 M | | | | |
| resnet18 | 0.75 | 94.14 | -0.34 | 4.5 M | 59.29% | | | |
| resnet18 | 0.5 | 94.8 | +0.32 | 3.5 M | 68.83% | | | |
| resnet50 | 0 | 94.65 | 0 | 23.5 M | | | | |
| resnet50 | 0.75 | 95.29 | +0.64 | 5.3 M | 77.59% | | | |
| resnet50 | 0.5 | 95.42 | +0.77 | 14.8 M | 37.04% | | | |
| vgg11_bn | 0 | 91.7 | 0 | 128 M | | | | |
| vgg11_bn | 0.75 | 89.85 | -1.85 | 28.9 M | 77.53% | | | |
| vgg11_bn | 0.5 | 91.46 | -0.24 | 58.5 M | 54.58% | | | |
| mobilenet_v2 | 0 | 94.52 | 0 | 2.2 M | | 8.5 | | |
| mobilenet_v2 | 0.75 | 91.17 | -3.35 | 661 K | 70.41% | 2.6 | | |
| mobilenet_v2 (s=1e-5) | 0 | 94.42 | 0 | 2.2 M | | 8.5 | | |
| mobilenet_v2 (s=1e-5) | 0.75 | 93.12 | -1.3 | 597 K | 73.30% | 2.3 | | |
| mobilenet_v3_large_nose | 0 | 92.61 | 0 | 2.7 M | | | | |
| mobilenet_v3_large_nose | 0.5 | 93.33 | +0.72 | 1.9 M | 30.09% | | | |
| mobilenet_v3_large_nose | 0.75 | 91.42 | -1.19 | 1.4 M | 48.33% | | | |
| mobilenet_v3_small_nose | 0 | 90.69 | 0 | 1.1 M | | | | |
| mobilenet_v3_small_nose | 0.5 | 91.08 | +0.39 | 777 K | 27.11% | | | |
| mobilenet_v3_small_nose | 0.75 | 87.25 | -3.44 | 564 K | 47.11% | | | |
| mobilenet_v3_large | 0 | 92.96 | 0 | 4.2 M | | 16 | 100 ms | 165 MB |
| mobilenet_v3_large | 0.75 | 92.18 | -0.78 | 1.6 M | 63.12% | 5.9 | 65 ms | 151 MB |
| mobilenet_v3_large | 0.5 | 92.87 | -0.09 | 2.3 M | 45.57% | 8.7 | 81 ms | 123 MB |
  • For RepVGG-A0-woid (prune ratio 0.7), the fine-tune learning rate was 0.001
  • woid: RepVGGBlock without the identity layer
  • nose: MobileNetV3 without the SE block

Experiment results for pruning without sparsity training:

| Net | Sparsity | Prune Ratio | Test Acc | Test Acc Diff | Params | Size Reduce |
| --- | --- | --- | --- | --- | --- | --- |
| resnet18 | 0 | 0 | 93.65 | 0 | 11.2 M | |
| resnet18 | 0 | 0.75 | 91.07 | -2.58 | 389 K | 96.52% |

TODO: understand why the model size is reduced by so much.

How to use pns in your project

  1. Read the paper Learning Efficient Convolutional Networks Through Network Slimming, then install pns by running python3 setup.py install
  2. Refer to the gen_schema.py script to generate the pruning schema. You may need to implement your own build_model section.
  3. Training: call update_bn_grad after loss.backward()
...
loss.backward()
update_bn_grad(model, s=0.0001)
...
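
update_bn_grad applies the L1 sparsity penalty from the Network Slimming paper to the BatchNorm scale factors. A minimal sketch of what such an update typically looks like (the actual pns implementation may differ):

    import torch
    from torch import nn

    def update_bn_grad_sketch(model: nn.Module, s: float = 1e-4):
        # Add the L1 subgradient s * sign(gamma) to each BN scale factor's
        # gradient, pushing unimportant channels' gamma toward zero.
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d):
                m.weight.grad.data.add_(s * torch.sign(m.weight.data))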
  4. Fine tune:

    • Restore the model weights according to the normal flow of your project
    • Prune according to the prune_schema of your network
    • Save the pruning result anywhere you like
    • Fine-tune the pruned model
    pruner = SlimPruner(restored_model, prune_schema)
    pruning_result: List[Dict] = pruner.run(prune_ratio=0.75)
    # save pruning_result
    ...
    # fine tune pruner.pruned_model
    ...
    # save the pruned model's state_dict (path is illustrative)
    torch.save(pruner.pruned_model.state_dict(), "pruned_model.ckpt")
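
Since pruning_result is a plain List[Dict], one option is to persist it alongside the weights; a sketch with illustrative paths (assuming the dicts are JSON-serializable; otherwise torch.save works as well):

    import json
    import torch

    with open("pruning_result.json", "w") as f:   # illustrative path
        json.dump(pruning_result, f)
    torch.save(pruner.pruned_model.state_dict(), "pruned_model.ckpt")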
  5. Load the pruning result/params when doing a forward pass or pruning again:

# build model
pruner = SlimPruner(model)
# load pruning_result from somewhere to get a slim network
pruning_result: List[Dict]
pruner.apply_pruning_result(pruning_result)
# load the pruned state_dict from somewhere
pruned_state_dict: Dict
pruner.pruned_model.load_state_dict(pruned_state_dict)
# do forward passes or train with update_bn_grad again using pruner.pruned_model
pruner.pruned_model
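
A concrete version of this loading flow, mirroring the illustrative files from the save sketch above:

    import json
    import torch

    pruner = SlimPruner(model)
    with open("pruning_result.json") as f:        # illustrative path
        pruner.apply_pruning_result(json.load(f))
    pruner.pruned_model.load_state_dict(torch.load("pruned_model.ckpt"))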

Pruning Schema

{
  "prefix": "model.",
  "channel_rounding": "eight",
  "modules": [
    {
      "name": "conv1",
      "prev_bn": "",
      "next_bn": "bn1"
    }
  ],
  "shortcuts": [
    {
      "names": ["bn1", "layer1.0.bn2", "layer1.1.bn2"],
      "method": "or"
    }
  ],
  "depthwise_conv_adjacent_bn": [
    {
      "names": ["bn1", "layer1.0.bn2", "layer1.1.bn2"],
      "method": "or"
    }
  ],
  "fixed_bn_ratio": [
    {
      "name": "name1",
      "ratio": 0.8
    },
    {
      "name": ["name2"],
      "ratio": 0.8
    }
  ]
}
  • prefix: common prefix added to all module names
  • channel_rounding: none/eight/two_pow (see the sketch after this list)
  • modules: Conv2d or Linear layers
  • shortcuts/depthwise_conv_adjacent_bn: BatchNorm2d layers (merge methods are sketched after this list)
    • or: all BN layers keep the union of the reserved channels
    • and: all BN layers keep the intersection of the reserved channels
  • fixed_bn_ratio: a fixed prune ratio for BatchNorm2d layers, applied before merging shortcuts
    • name: string or List[string]
    • ratio: prune ratio
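
To illustrate the intended semantics of the merge methods and channel rounding, here is a sketch (not the actual pns code):

    from typing import List

    def merge_keep_idxes(keep_idxes_per_bn: List[List[int]],
                         method: str = "or") -> List[int]:
        # "or": union of channels kept by any BN in the shortcut group
        # "and": intersection, i.e. channels kept by every BN
        sets = [set(idxes) for idxes in keep_idxes_per_bn]
        merged = set.union(*sets) if method == "or" else set.intersection(*sets)
        return sorted(merged)

    def round_up_channels(n: int, mode: str = "eight") -> int:
        # "eight": round up to a multiple of 8; "two_pow": next power of two
        if mode == "eight":
            return ((n + 7) // 8) * 8
        if mode == "two_pow":
            return 1 << (n - 1).bit_length()
        return n  # "none"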

Development

Run test

pytest -v tests


pytorch-network-slimming's Issues

SlimPruner usage

Today I modified gen_schema.py to generate the JSON file I need. While reading the training code, I noticed the model is first wrapped in a pytorch lightning layer, and the argument passed when initializing SlimPruner is a LitModel instance. If the original network code is plain pytorch, does it need to be wrapped in a similar way?
restored_model = LitModel.load_from_checkpoint(last_model_path, args=args)

pruner = SlimPruner(restored_model, args.prune_schema)
pruning_result = pruner.run(args.prune_ratio)
The readme doesn't mention this, so I'm a bit confused.
Also, regarding the pytorch lightning code: I'm not familiar with pytorch lightning, so I wonder whether the path set in the callback is used to save the model during training? It feels a bit unusual to me.

python3 gen_schema.py --net resnet18 --save_path ./schema/resnet18.json raises an error

Traceback (most recent call last):
  File "gen_schema.py", line 31, in <module>
    y = model.features(x)
  File "/home/xxx/anaconda3/envs/slim/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'ResNet' object has no attribute 'features'

AssertionError: len(in_keep_idxes): 2048, module.weight.shape[1]: 128

Traceback (most recent call last):
  File "gen_schema.py", line 77, in <module>
    pruner.run(0.6)
  File "/root/anaconda3/envs/slim/lib/python3.8/site-packages/pns-0.1.0-py3.8.egg/pns/pns.py", line 307, in run
    conv2d.prune(
  File "/root/anaconda3/envs/slim/lib/python3.8/site-packages/pns-0.1.0-py3.8.egg/pns/pns.py", line 50, in prune
    self.in_channels_keep_idxes, self.out_channels_keep_idxes = prune_conv2d(
  File "/root/anaconda3/envs/slim/lib/python3.8/site-packages/pns-0.1.0-py3.8.egg/pns/functional.py", line 75, in prune_conv2d
    assert (
AssertionError: len(in_keep_idxes): 2048, module.weight.shape[1]: 128
The error is shown above.
Model structure:
model: Net(
(backbone): ResNetWrapper(
(model): ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer2): Sequential(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer3): Sequential(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
)
(out): Conv2d(2048, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
)
(xxx): xxx(
(conv_d0): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_u0): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_r0): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_l0): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_d1): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_u1): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_r1): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_l1): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_d2): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_u2): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_r2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_l2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_d3): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_u3): Conv2d(128, 128, kernel_size=(1, 9), stride=(1, 1), padding=(0, 4), bias=False)
(conv_r3): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
(conv_l3): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0), bias=False)
)
(decoder): BUSD(
(layers): ModuleList(
(0): UpsamplerBlock(
(conv): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(follows): ModuleList(
(0): non_bottleneck_1d(
(conv3x1_1): Conv2d(64, 64, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(64, 64, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(64, 64, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(64, 64, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
(1): non_bottleneck_1d(
(conv3x1_1): Conv2d(64, 64, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(64, 64, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(64, 64, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(64, 64, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
)
(interpolate_conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(interpolate_bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(1): UpsamplerBlock(
(conv): ConvTranspose2d(64, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(follows): ModuleList(
(0): non_bottleneck_1d(
(conv3x1_1): Conv2d(32, 32, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(32, 32, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(32, 32, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(32, 32, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
(1): non_bottleneck_1d(
(conv3x1_1): Conv2d(32, 32, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(32, 32, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(32, 32, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(32, 32, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
)
(interpolate_conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
(interpolate_bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(2): UpsamplerBlock(
(conv): ConvTranspose2d(32, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(follows): ModuleList(
(0): non_bottleneck_1d(
(conv3x1_1): Conv2d(16, 16, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(16, 16, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(16, 16, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(16, 16, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
(1): non_bottleneck_1d(
(conv3x1_1): Conv2d(16, 16, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_1): Conv2d(16, 16, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn1): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(conv3x1_2): Conv2d(16, 16, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
(conv1x3_2): Conv2d(16, 16, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1))
(bn2): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
(dropout): Dropout2d(p=0, inplace=False)
)
)
(interpolate_conv): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(interpolate_bn): BatchNorm2d(16, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(output_conv): Conv2d(16, 6, kernel_size=(1, 1), stride=(1, 1), bias=False)
)
(heads): ExistHead(
(dropout): Dropout2d(p=0.1, inplace=False)
(conv8): Conv2d(128, 6, kernel_size=(1, 1), stride=(1, 1))
(fc9): Linear(in_features=5280, out_features=128, bias=True)
(fc10): Linear(in_features=128, out_features=5, bias=True)
)
)

Question about tracker

Hi!

Thanks for the excellent work!

I'd like to ask whether the tracker can support tracking a class created by inheriting from nn.Conv2d.
Currently I am trying to prune EfficientNet, but EfficientNet uses a class derived as below:

import math

import torch.nn.functional as F
from torch import nn


class Conv2dDynamicSamePadding(nn.Conv2d):
    """ 2D Convolutions like TensorFlow, for a dynamic image size """

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1, groups=1, bias=True):
        super().__init__(in_channels, out_channels, kernel_size, stride, 0, dilation, groups, bias)
        self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]] * 2

    def forward(self, x):
        ih, iw = x.size()[-2:]
        kh, kw = self.weight.size()[-2:]
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])
        return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)

How can I make this layer tracked? Or could you provide some insight?
Thanks!

KeyError: 'encoder.level1.conv'

Traceback (most recent call last):
  File "prune.py", line 126, in <module>
    pruner = SlimPruner(model, "/home/jing/model/schema/model.json")
  File "/home/jing/pytorch-network-slimming/src/pns/pns.py", line 187, in __init__
    modules[name]["name"],
KeyError: 'encoder.level1.conv'
When initializing SlimPruner with the generated JSON file, I get the error above. Is there anything to be aware of here?
Also, what is the difference between
pruning_result = pruner.run(args.prune_ratio)
and pruner.pruned_model?

Cannot prune the following network with two consecutive convolution blocks

from torch import nn

class StrangeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 256, kernel_size=3)
        self.bn = nn.BatchNorm2d(256)
        self.conv2_1 = nn.Conv2d(256, 128, kernel_size=3)
        self.conv2_2 = nn.Conv2d(128, 128, kernel_size=3)
        self.bn2 = nn.BatchNorm2d(128)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn(x)
        x = self.conv2_1(x)
        x = self.conv2_2(x)
        x = self.bn2(x)
        return x

conv pruning issue

Thanks for your work. I am pruning my two-stage detection model, Faster R-CNN with a resnet50+fpn backbone. While running gen_schema.py, I hit a conv pruning issue. Since the model isn't a classification task, is there anything I need to change?
Traceback (most recent call last):
  File "gen_schema.py", line 86, in <module>
    pruner.run(0.6)
  File "model_compression/src/pns/pns.py", line 311, in run
    self.bn2d_modules[conv2d.next_bn_name] if conv2d.next_bn_name else None,
  File "model_compression/src/pns/pns.py", line 53, in prune
    next_bn.keep_idxes if next_bn else None,
  File "model_compression/src/pns/functional.py", line 97, in prune_conv2d
    module.bias = torch.nn.Parameter(module.bias.data[out_keep_idxes])
RuntimeError: CUDA error: device-side assert triggered
My email is [email protected] and my wechat is 13773273112.Looking forward to your reply, thanks!

Question about the saved model size in the example code

Training resnet18 with the example code, the models saved to the output directory are 86M, as follows:
-rw-rw-r-- 1 jing jing 86M 2月 11 06:26 epoch=119-train_loss=0.02-test_acc=0.905.ckpt
-rw-rw-r-- 1 jing jing 86M 2月 10 09:43 epoch=69-train_loss=0.00-test_acc=0.902.ckpt
-rw-rw-r-- 1 jing jing 81M 2月 10 10:05 events.out.tfevents.1612919576.image.105605.0
-rw-rw-r-- 1 jing jing 462 2月 10 10:05 events.out.tfevents.1612922725.image.105605.1
-rw-rw-r-- 1 jing jing 79M 2月 11 06:26 events.out.tfevents.1612992806.image.47582.0
-rw-rw-r-- 1 jing jing 462 2月 11 06:26 events.out.tfevents.1612996007.image.47582.1
-rw-rw-r-- 1 jing jing 450 2月 12 17:33 events.out.tfevents.1613122393.image.104338.0
-rw-rw-r-- 1 jing jing 450 2月 12 18:49 events.out.tfevents.1613126993.image.26640.0
-rw-rw-r-- 1 jing jing 450 2月 13 07:20 events.out.tfevents.1613172040.image.24783.0
-rw-rw-r-- 1 jing jing 86M 2月 11 06:26 last.ckpt
-rw-rw-r-- 1 jing jing 115 2月 13 07:20 metric.json
-rw-rw-r-- 1 jing jing 86M 2月 11 06:26 model_with_pruning_result.ckpt
drwxrwxr-x 2 jing jing 4.0K 2月 11 07:16 pruned_0.75
Under the pruned_0.75 folder, the model size is 11M.
-rw-rw-r-- 1 jing jing 11M 2月 10 10:15 epoch=24-train_loss=0.01-test_acc=0.932.ckpt
-rw-rw-r-- 1 jing jing 11M 2月 11 06:47 epoch=49-train_loss=0.00-test_acc=0.935.ckpt
-rw-rw-r-- 1 jing jing 58K 2月 10 10:15 events.out.tfevents.1612922728.image.105605.2
-rw-rw-r-- 1 jing jing 261K 2月 11 07:16 events.out.tfevents.1612996011.image.47582.2
-rw-rw-r-- 1 jing jing 462 2月 11 07:16 events.out.tfevents.1612998965.image.47582.3
-rw-rw-r-- 1 jing jing 11M 2月 11 07:16 last.ckpt
-rw-rw-r-- 1 jing jing 116 2月 11 07:16 metric.json
The readme table lists the resnet18 model size as 11M, and 4.5 or 3.5 after pruning. What is the reason for this difference? Looking at the code, it seems the models are saved together; isn't the pruned model saved separately?

AssertionError: len(in_keep_idxes): 16, module.weight.shape[1]: 1

Traceback (most recent call last):
  File "gen_schema.py", line 78, in <module>
    pruner.run(0.1) # j
  File "/home/xxx/zxz/pytorch-network-slimming/src/pns/pns.py", line 307, in run
    conv2d.prune(
  File "/home/xxx/zxz/pytorch-network-slimming/src/pns/pns.py", line 50, in prune
    self.in_channels_keep_idxes, self.out_channels_keep_idxes = prune_conv2d(
  File "/home/xxx/zxz/pytorch-network-slimming/src/pns/functional.py", line 75, in prune_conv2d
    assert (
AssertionError: len(in_keep_idxes): 16, module.weight.shape[1]: 1

When generating the schema JSON file, the model should have the trained weights loaded, right? I can generate the JSON file normally, but the batch size of the input x cannot be 1; with 1 it raises an error, and setting it to 8 works. However, when doing the prune operation in the code below, it crashes as shown above.
