Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

This repository is the official implementation of "Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks", which is under submission to NeurIPS 2021.

In this paper, we present a recursive convolution block design and a training method in which the recursively shareable part of the block, a filter basis, is separated out and learned while effectively avoiding the vanishing/exploding gradient problem during training.
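To make the idea of a shared filter basis concrete, here is a minimal PyTorch sketch. It is not the code in this repository: the class name `SharedBasisConv`, the attributes `basis` and `coeff`, and all sizes are hypothetical, and the sketch assumes that each layer builds its filters as a linear combination of one basis tensor shared across layers. It does not reproduce the paper's training method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedBasisConv(nn.Module):
    """3x3 conv whose filters are linear combinations of a shared basis (illustrative only)."""

    def __init__(self, basis: nn.Parameter, out_channels: int):
        super().__init__()
        # Shared by reference: every layer holds the same Parameter object.
        self.basis = basis  # shape (rank, in_channels, 3, 3)
        rank = basis.shape[0]
        # Per-layer coefficients: the only convolution weights unique to this layer.
        self.coeff = nn.Parameter(torch.randn(out_channels, rank) / rank ** 0.5)
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        # (out, rank) combined with (rank, in, 3, 3) gives filters of shape (out, in, 3, 3).
        weight = torch.einsum("or,rikl->oikl", self.coeff, self.basis)
        return F.relu(self.bn(F.conv2d(x, weight, padding=1)))


# Hypothetical sizes chosen only for the demonstration.
channels, rank, depth = 16, 8, 3
basis = nn.Parameter(torch.randn(rank, channels, 3, 3) * 0.1)
layers = nn.ModuleList([SharedBasisConv(basis, channels) for _ in range(depth)])

x = torch.randn(2, channels, 8, 8)
for layer in layers:
    x = layer(x)

# parameters() yields the shared basis only once, so it is not double-counted.
total_params = sum(p.numel() for p in layers.parameters())
print(tuple(x.shape), total_params)
```

In this toy setup the basis is stored once regardless of depth, so adding another layer adds only its coefficients and batch-norm parameters.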

Requirements

We conducted our experiments with:

Python 3.8
PyTorch 1.8, torchvision 0.9, CUDA 11

To install requirements:

pip3 install -r requirements.txt

Training

To train ResNet34-S48U1 in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --shared_rank=48 --unique_rank=1 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet34_DoubleShared --visible_device=0,1,2,3

To train ResNet50-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=4e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet50_SharedSingle --visible_device=0,1,2,3

To train ResNet50-Shared++ in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet50_Shared --visible_device=0,1,2,3

To train ResNet101-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet101_SharedSingle --visible_device=0,1,2,3

To train MobileNetV2-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=MobileNetV2_Shared --visible_device=0,1,2,3

To train MobileNetV2-Shared++ in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=MobileNetV2_SharedDouble --visible_device=0,1,2,3

Evaluation

To evaluate ResNet34-S48U1 in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet34_DoubleShared --shared_rank=48 --unique_rank=1 --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet50-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet50_SharedSingle --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet50-Shared++ in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet50_Shared --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet101-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet101_SharedSingle --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate MobileNetV2-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=MobileNetV2_Shared --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate MobileNetV2-Shared++ in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=MobileNetV2_SharedDouble --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

Results and Pretrained models

Our models achieve the following performance:

ILSVRC2012 Classification

Model name	Top-1 Error	Top-5 Error	Params	FLOPs	Pretrained model
ResNet34-S48U1*	26.67%	8.54%	11.79M	3.26G	Download
ResNet50-Shared	23.64%	6.98%	20.51M	4.11G	Download
ResNet50-Shared++	23.95%	7.14%	18.26M	4.11G	Download
ResNet101-Shared	22.31%	6.47%	29.47M	7.83G	Download
MobileNetV2-Shared	27.61%	9.34%	3.24M	0.33G	Download
MobileNetV2-Shared++	28.21%	9.85%	2.98M	0.33G	Download
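For readers unfamiliar with the error columns, the following is a small pure-Python sketch of how top-k error is conventionally computed: a prediction counts as correct if the true label is among the k highest-scoring classes. The function name `topk_error` and the toy scores are made up for illustration, and this is not necessarily how `eval_ilsvrc.py` computes its numbers.

```python
def topk_error(scores, labels, k=1):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in topk:
            misses += 1
    return misses / len(labels)


# Toy example: 4 samples, 3 classes (values are illustrative, not real results).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 2]

top1 = topk_error(scores, labels, k=1)
top2 = topk_error(scores, labels, k=2)
print(f"top-1 error: {top1:.2%}, top-2 error: {top2:.2%}")
```

Top-5 error on ILSVRC2012 follows the same definition with k=5 over the 1000 classes.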

Contributing

There is no way to contribute to the code yet; however, this is subject to change.
