mit-han-lab / amc
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Home Page: https://arxiv.org/abs/1802.03494
License: MIT License
Hello,
I have already evaluated the AMC code with MobileNet on the ImageNet and CIFAR-10 datasets.
It works well.
I would like to explore other networks such as mobilenet_v2, resnet50, inception_v3 etc.
According to the paper, some of these models are already implemented.
When will these models be available?
The code logic at line 371 of channel_pruning_env.py is missing one line: self.shared_idx.append(share_group) should be added after the for loop.
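For illustration, here is a self-contained sketch of the bug pattern being described (this is not the repo's actual code; the function and variable names are invented for the example). A share group is accumulated inside the loop, but the last group is never flushed, so the proposed fix appends it once the loop ends:

    def group_shared_layers(layer_ids, shares_with_prev):
        """layer_ids: prunable layer indices; shares_with_prev[i] is True when
        layer_ids[i] must share its pruning ratio with layer_ids[i-1]."""
        shared_idx = []
        share_group = []
        for i, idx in enumerate(layer_ids):
            if i > 0 and shares_with_prev[i]:
                share_group.append(idx)              # extend the current group
            else:
                if len(share_group) > 1:
                    shared_idx.append(share_group)   # close the finished group
                share_group = [idx]                  # start a new group
        # The fix proposed in this issue: flush the last group after the for loop
        if len(share_group) > 1:
            shared_idx.append(share_group)
        return shared_idx

    print(group_shared_layers([4, 6, 8, 10, 12], [False, True, False, True, True]))
    # -> [[4, 6], [8, 10, 12]]; without the final append, [8, 10, 12] would be lost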
MobileNetV2 pls!
Hi, thanks for your great work!
However, I find that it always validates the original model under "train" mode.
In env.step(), it validates the model (line 123) to get the reward after the prune_kernel() operation (line 97).
But the model is only actually pruned under "export" mode, as in lines 249 to 261.
So what's the problem?
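For context, one possible reading of this behavior (an assumption on my part, not confirmed by the authors): during the search, pruning is simulated by zeroing the channels the agent drops, so validation in "train" mode already reflects the pruned model even though the layer shapes only shrink in "export" mode. A minimal, self-contained illustration:

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(8, 16, 3, padding=1)
    preserve_idx = torch.tensor([0, 2, 5])                # channels the agent keeps (example)
    mask = torch.zeros(conv.in_channels, dtype=torch.bool)
    mask[preserve_idx] = True

    with torch.no_grad():
        conv.weight[:, ~mask, :, :] = 0.0                 # simulated pruning: zero the dropped input channels

    x = torch.randn(1, 8, 32, 32)
    y = conv(x)                                           # output already behaves as if pruned
    print(y.shape)                                        # torch.Size([1, 16, 32, 32]) -- shape unchanged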
Hello, thanks for your work.
I modified your code to prune 3x3 conv layers using im2col in order to prune VGG16. The result is about a 40% accuracy drop from the baseline before re-training, and a 3.7% drop after re-training. This is much worse than your result. I used the experimental settings from the paper as far as possible, and for settings not given in the paper I used the defaults. I also tried some other settings (--use_new_input, --n_calibration_batches=XX, and so on); all failed.
I realized that I pruned a lot of channels in the Conv5-1 and Conv5-2 layers (they are not very redundant and should not be pruned, according to your previous paper on channel pruning). Therefore, one possible improvement is to manually exclude those layers from the pruning targets. However, I think AMC should be able to do this automatically.
Do you have any idea how to reproduce the paper's results? For the code that prunes the 3x3 convs, please see the pull request. Is the implementation correct?
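For reference, a rough sketch of im2col-based channel selection with least-squares reconstruction for a 3x3 conv, in the spirit of He et al.'s channel pruning (this is not the repo's implementation; the shapes and the greedy, energy-based selection are simplifications of the LASSO-based selection in the paper):

    import numpy as np

    def prune_conv3x3(X, W, keep):
        """X: im2col'ed inputs, shape (N, C, k*k); W: conv weights reshaped to
        (C_out, C, k*k); keep: number of input channels to preserve."""
        N, C, kk = X.shape
        Y = np.einsum('nck,ock->no', X, W)                  # original layer response
        contrib = np.einsum('nck,ock->nco', X, W)           # per-channel contribution to Y
        energy = (contrib ** 2).sum(axis=(0, 2))
        idx = np.argsort(-energy)[:keep]                    # greedily keep high-energy channels
        X_keep = X[:, idx, :].reshape(N, keep * kk)
        W_new, *_ = np.linalg.lstsq(X_keep, Y, rcond=None)  # reconstruct Y from kept channels only
        W_new = W_new.reshape(keep, kk, -1).transpose(2, 0, 1)
        return idx, W_new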
I tried to compress the MobileNet V1 model with the Visual Wake Words dataset. The compression ratio cannot be too high: at, e.g., 7x compression (14% of the original model size), the accuracy drop is obvious.
However, when I compress the model in two stages, first keeping 25% and then 56% of that (0.25 x 0.56 ≈ 0.14), the accuracy stays as high as the original model while the total compression ratio is still 14%.
Does this mean that AMC does not work well at high compression ratios? Or am I misunderstanding your algorithm? Have you tested high-compression-ratio cases?
Hi all,
I use im2col to implement the 3x3 conv linear regression, but pruning is slow: it takes about 2 days to finish 800 training steps. The reward converges, but the accuracy reward is low when pruning 50%; the best accuracy is about 8%. ResNet-18 does better, with a highest accuracy reward of about 23%. Is this phenomenon normal?
Hi, thanks for your great work.
How can I get the exact ImageNet dataset you used for the strategy search on MobileNet?
How to search&export resnet and mobilenet v2? Can you give me some advice? Thank you very much.
I'm trying to compress ProxylessNAS but met the following error.
The model comes from:
model = torch.hub.load('mit-han-lab/ProxylessNAS', "proxyless_cpu", pretrained=True)
Could you please give me comments to fix the problem?
Error message:
File "amc_search.py", line 262, in
train(args.train_episode, agent, env, args.output)
File "amc_search.py", line 142, in train
observation2, reward, done, info = env.step(action)
File "/home/work/amc_vww_proxless/env/channel_pruning_env.py", line 97, in step
action, d_prime, preserve_idx = self.prune_kernel(self.prunable_idx[self.cur_ind], action, preserve_idx)
File "/home/work/amc_vww_proxless/env/channel_pruning_env.py", line 228, in prune_kernel
masked_X = X[:, mask]
IndexError: boolean index did not match indexed array along dimension 1; dimension is 80 but corresponding boolean dimension is 40
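For what it's worth, the error itself is easy to reproduce in isolation: the boolean mask was built for a 40-channel layer while the collected im2col features have 80 columns (this mismatch is my reading of the traceback; whether it comes from ProxylessNAS's shortcut/concat structure is only a guess):

    import numpy as np

    X = np.random.randn(16, 80)         # collected input features: 80 columns
    mask = np.zeros(40, dtype=bool)     # mask built from a 40-channel assumption
    mask[:20] = True
    masked_X = X[:, mask]               # IndexError: boolean index did not match indexed array ...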
Hey. I notice that on the ImageNet task, the accuracy of the model after pruning is used to calculate the reward. In object detection scenarios, what do you use to calculate it? mAP, or the loss?
Thank you for sharing the code.
    if i == self.cur_ind - 1:  # TODO: add other member in the set
        this_comp += flop * self.strategy_dict[idx][0]
        # add buffer (but not influenced by ratio)
        other_comp += buffer_flop * self.strategy_dict[idx][0]
    elif i == self.cur_ind:
        this_comp += flop * self.strategy_dict[idx][1]
        # also add buffer here (influenced by ratio)
        this_comp += buffer_flop
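For context, this kind of per-layer bookkeeping typically feeds a FLOPs-constrained bound on the agent's action; the following is a rough, simplified illustration of that idea, not the repo's exact logic (all names are invented for the example):

    def bound_preserve_ratio(action, flops_budget, this_comp, other_comp, rest_min_comp):
        # other_comp: FLOPs already fixed by earlier decisions (and buffers)
        # this_comp: FLOPs of the current layer at full width
        # rest_min_comp: the smallest FLOPs the remaining layers can still reach
        max_this = flops_budget - other_comp - rest_min_comp
        max_ratio = max_this / this_comp
        return float(min(max(action, 0.0), max(max_ratio, 0.0), 1.0))

    # e.g. with an 80 MFLOPs budget, 30 MFLOPs committed, a 40 MFLOPs current layer
    # and at least 20 MFLOPs needed downstream, a 0.9 action is capped at 0.75
    print(bound_preserve_ratio(0.9, 80, 40, 30, 20))   # 0.75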
Hello,
In the AMC paper, I see that the authors also compressed Plain-20 and ResNet on the CIFAR-10 dataset. Currently, I want to compress AlexNet on CIFAR-10. Can I base this on your code? And if so, what portion of the code or which variables would need to be modified?
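As a rough starting point, a CIFAR-10-sized AlexNet could look like the sketch below; this is a generic PyTorch definition, not part of this repo, and it would still need to be wired into the search/export scripts the same way the existing models are:

    import torch
    import torch.nn as nn

    class AlexNetCifar(nn.Module):
        """AlexNet-style network for 32x32 CIFAR-10 inputs; the Conv2d modules
        are the layers a channel-pruning search would act on."""
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
                nn.Conv2d(64, 192, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
                nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(256 * 4 * 4, num_classes)

        def forward(self, x):
            return self.classifier(self.features(x).flatten(1))

    model = AlexNetCifar()
    print(model(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 10])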
Hello,
I wonder when MobileNet V2 will be supported.
If that's hard to do for now, could you please explain the details?
For example, what portion of the code or which variables would need to be modified?
Thank you
Hi, thanks for your great work.
I want to compare the inference time after compressing MobileNet. I measured with the batch size set to 1 using the code snippet below (from eval_mobilenet.py); is this the correct way?
start, end = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)
start.record()
outputs = net(inputs) # forward
end.record()
torch.cuda.synchronize()
curr_time = start.elapsed_time(end)
When measured, the inference time of MobileNet before and after compression was about 2.9 ms (the average of curr_time). This value is much larger than the inference time reported in the paper (0.4 ms), so I would like to know how the inference time was measured.
Thank you.
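For reference, GPU latency at batch size 1 is usually measured with warm-up iterations, a synchronize before starting the events, and an average over many runs; without the warm-up, cuDNN autotuning and lazy CUDA initialization inflate the numbers. A minimal sketch (the tiny model here is only a stand-in; substitute the original or pruned MobileNet):

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1)).cuda().eval()
    inputs = torch.randn(1, 3, 224, 224, device='cuda')

    with torch.no_grad():
        for _ in range(20):                       # warm-up iterations
            net(inputs)
        torch.cuda.synchronize()

        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        n_iters = 100
        start.record()
        for _ in range(n_iters):
            net(inputs)
        end.record()
        torch.cuda.synchronize()
        print('avg latency: %.3f ms' % (start.elapsed_time(end) / n_iters))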
Can the backbone pruned with this code be used for object detection tasks? Or does it mean that this method can compress object detection models? @tonylins
Thank you for sharing the code. I didn't find download.sh in /checkpoints; can you give me some information? Also, when I run ./scripts/search_mobilenetv2_0.7flops.sh, I encounter the following error:
~/AMC$ ./scripts/search_mobilenetv2_0.7flops.sh
=> Preparing data: imagenet...
=> Conv layers to share channels: [[4, 6], [8, 10, 12], [14, 16, 18, 20], [22, 24, 26]]
=> Prunable layer idx: [3, 11, 15, 21, 25, 31, 35, 41, 45, 51, 55, 61, 65, 71, 75, 81, 85, 91, 95, 101, 105, 111, 115, 121, 125, 131, 135, 141, 145, 151, 155, 161, 165, 171, 174, 179]
=> Buffer layer idx: [8, 18, 28, 38, 48, 58, 68, 78, 88, 98, 108, 118, 128, 138, 148, 158, 168]
=> Initial min strategy dict: {3: [1, 0.2], 11: [0.2, 0.2], 15: [0.2, 0.2], 21: [0.2, 0.2], 25: [0.2, 0.2], 31: [0.2, 0.2], 35: [0.2, 0.2], 41: [0.2, 0.2], 45: [0.2, 0.2], 51: [0.2, 0.2], 55: [0.2, 0.2], 61: [0.2, 0.2], 65: [0.2, 0.2], 71: [0.2, 0.2], 75: [0.2, 0.2], 81: [0.2, 0.2], 85: [0.2, 0.2], 91: [0.2, 0.2], 95: [0.2, 0.2], 101: [0.2, 0.2], 105: [0.2, 0.2], 111: [0.2, 0.2], 115: [0.2, 0.2], 121: [0.2, 0.2], 125: [0.2, 0.2], 131: [0.2, 0.2], 135: [0.2, 0.2], 141: [0.2, 0.2], 145: [0.2, 0.2], 151: [0.2, 0.2], 155: [0.2, 0.2], 161: [0.2, 0.2], 165: [0.2, 0.2], 171: [0.2, 0.2], 174: [0.2, 0.2], 179: [0.2, 1]}
=> Extracting information...
=> shape of embedding (n_layer * n_dim): (36, 10)
=> original acc: 97.067%
=> original weight size: 3.4708 M param
=> FLOPs:
[10.838016, 3.612672, 6.422528, 19.267584, 2.709504, 7.225344, 10.838016, 4.064256, 10.838016, 10.838016, 1.016064, 3.612672, 4.816896, 1.354752, 4.816896, 4.816896, 1.354752, 4.816896, 4.816896, 0.338688, 2.408448, 4.816896, 0.677376, 4.816896, 4.816896, 0.677376, 4.816896, 4.816896, 0.677376, 4.816896, 4.816896, 0.677376, 7.225344, 10.838016, 1.016064, 10.838016, 10.838016, 1.016064, 10.838016, 10.838016, 0.254016, 4.51584, 7.5264, 0.42336, 7.5264, 7.5264, 0.42336, 7.5264, 7.5264, 0.42336, 15.0528, 20.0704, 1.281]
=> original FLOPs: 300.7753 M
=> Saving logs to ./logs/mobilenetv2_imagenet_r0.7_search-run1
=> Output path: ./logs/mobilenetv2_imagenet_r0.7_search-run1...
** Actual replay buffer size: 3600
Traceback (most recent call last):
File "amc_search.py", line 242, in
train(args.train_episode, agent, env, args.output)
File "amc_search.py", line 122, in train
observation2, reward, done, info = env.step(action)
File "/home/wangzhaoming/AMC/env/channel_pruning_env.py", line 152, in step
self.layer_embedding[self.cur_ind][-2] = sum(self.flops_list[self.cur_ind + 1:]) * 1. / self.org_flops # rest
AttributeError: 'ChannelPruningEnv' object has no attribute 'flops_list'
Can it be solved? Thank you very much.
First, thank you for your brilliant work!
I am wondering how I can reuse the RL model after 2400 episodes of training. The --resume argument doesn't work. The load_weights function in agent.py seems to have no apparent effect on the accuracy at the beginning of training. Is a well-trained RL model useless for other pruning tasks?
Hi, could you please explain how to organize the data folder structure for imagenet?
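Assuming the standard torchvision ImageFolder layout (an assumption; the paths below are only examples), the data folder would have one subfolder per class under train/ and val/:

    # data/imagenet/
    #     train/
    #         n01440764/xxx.JPEG ...
    #         n01443537/yyy.JPEG ...
    #     val/
    #         n01440764/zzz.JPEG ...
    #         n01443537/www.JPEG ...
    from torchvision import datasets, transforms

    train_set = datasets.ImageFolder('data/imagenet/train', transform=transforms.ToTensor())
    val_set = datasets.ImageFolder('data/imagenet/val', transform=transforms.ToTensor())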
@zhijian-liu Hi, have you compressed ResNet on ImageNet using AMC? Could you share your results?
Hi, thanks for your great job.
I just wonder whether MobileNet V2 works now, or should I modify some code to make it work?