Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

This repository is the official implementation of "Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks", which is under submission to NeurIPS 2021.

In this paper, we present a recursive convolution block design and training method in which a recursively shareable part, the filter basis, is separated out and learned while effectively avoiding the vanishing/exploding gradient problem during training.
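To give a feel for why separating out a shareable filter basis saves parameters, here is a minimal counting sketch. It is not the repository's code. It assumes a simple decomposition in which one basis of `rank` filters is reused by every layer and each layer keeps only a small coefficient matrix; the function names and the layer sizes are hypothetical and chosen for illustration.

```python
def conv_params(in_ch, out_ch, k):
    """Weights in one standard k x k convolution (bias ignored)."""
    return in_ch * out_ch * k * k


def unshared_params(num_layers, channels, k):
    """N independent conv layers, each with its own full set of filters."""
    return num_layers * conv_params(channels, channels, k)


def shared_basis_params(num_layers, channels, k, rank):
    """One filter basis reused by every layer, plus per-layer coefficients.

    Assumption: the basis holds `rank` filters of shape (channels, k, k), and
    each layer owns a (channels x rank) coefficient matrix that recombines
    the basis outputs.
    """
    basis = conv_params(channels, rank, k)       # learned once, shared
    coefficients = num_layers * channels * rank  # learned per layer
    return basis + coefficients


if __name__ == "__main__":
    # Illustrative sizes only: 6 layers, 128 channels, 3x3 kernels, rank 48.
    n, c, k, s = 6, 128, 3, 48
    print("unshared:", unshared_params(n, c, k))
    print("shared basis:", shared_basis_params(n, c, k, s))
```

Under these assumptions the shared variant grows with the number of layers only through the coefficient matrices, which is the source of the savings.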

Requirements

We ran our experiments with:

  • Python 3.8
  • PyTorch 1.8, torchvision 0.9, CUDA 11
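If you want to confirm that an existing environment matches the versions listed above before training, a small check along these lines may help. It is a sketch that uses only the standard library. It assumes the PyPI package names `torch` and `torchvision`; the helper names are hypothetical and not part of this repository.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Versions listed in the Requirements section.
EXPECTED = {"torch": "1.8", "torchvision": "0.9"}


def version_matches(installed, expected_prefix):
    """True if `installed` is the expected release or a patch release of it."""
    # Drop local build tags such as "+cu111" before comparing.
    base = installed.split("+")[0]
    return base == expected_prefix or base.startswith(expected_prefix + ".")


def check_environment(lookup=metadata.version):
    """Return a list of human-readable mismatches (empty if all match)."""
    problems = []
    for package, expected in EXPECTED.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package}: not installed (expected {expected}.x)")
            continue
        if not version_matches(installed, expected):
            problems.append(f"{package}: found {installed}, expected {expected}.x")
    return problems


if __name__ == "__main__":
    print("python", ".".join(map(str, sys.version_info[:3])))
    for line in check_environment() or ["all listed packages match"]:
        print(line)
```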

To install requirements:

pip3 install -r requirements.txt

Training

To train ResNet34-S48U1 in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --shared_rank=48 --unique_rank=1 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet34_DoubleShared --visible_device=0,1,2,3

To train ResNet50-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=4e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet50_SharedSingle --visible_device=0,1,2,3

To train ResNet50-Shared++ in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet50_Shared --visible_device=0,1,2,3

To train ResNet101-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-4 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=ResNet101_SharedSingle --visible_device=0,1,2,3

To train MobileNetV2-Shared in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=MobileNetV2_Shared --visible_device=0,1,2,3

To train MobileNetV2-Shared++ in the paper on ILSVRC2012, run this command:

python3 train_ilsvrc.py --lr=0.1 --momentum=0.9 --weight_decay=1e-5 --lambdaR=10 --batch_size=256 --dataset_path=<path_to_dataset> --model=MobileNetV2_SharedDouble --visible_device=0,1,2,3
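The training commands differ only in a few flags, so when launching several runs it can be convenient to generate them from one table. The sketch below does that with flag values copied from the commands in this section. The helper `train_command` and the example dataset path are hypothetical, and the sketch assumes `train_ilsvrc.py` accepts its flags in any order.

```python
import shlex

# Flags shared by every training command in this section.
COMMON = {"lr": "0.1", "momentum": "0.9", "lambdaR": "10", "batch_size": "256"}

# Per-model flags, keyed by the model name used in the paper.
MODELS = {
    "ResNet34-S48U1": {"model": "ResNet34_DoubleShared", "weight_decay": "1e-4",
                       "shared_rank": "48", "unique_rank": "1"},
    "ResNet50-Shared": {"model": "ResNet50_SharedSingle", "weight_decay": "4e-5"},
    "ResNet50-Shared++": {"model": "ResNet50_Shared", "weight_decay": "1e-4"},
    "ResNet101-Shared": {"model": "ResNet101_SharedSingle", "weight_decay": "1e-4"},
    "MobileNetV2-Shared": {"model": "MobileNetV2_Shared", "weight_decay": "1e-5"},
    "MobileNetV2-Shared++": {"model": "MobileNetV2_SharedDouble", "weight_decay": "1e-5"},
}


def train_command(name, dataset_path, devices="0,1,2,3"):
    """Build the argument list for one training run."""
    flags = {**COMMON, **MODELS[name],
             "dataset_path": dataset_path, "visible_device": devices}
    return ["python3", "train_ilsvrc.py"] + [f"--{k}={v}" for k, v in flags.items()]


if __name__ == "__main__":
    # "/data/ilsvrc2012" is a placeholder path, not one used by the authors.
    print(shlex.join(train_command("ResNet50-Shared", "/data/ilsvrc2012")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues in the dataset path.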

Evaluation

To evaluate ResNet34-S48U1 in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet34_DoubleShared --shared_rank=48 --unique_rank=1 --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet50-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet50_SharedSingle --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet50-Shared++ in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet50_Shared --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate ResNet101-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=ResNet101_SharedSingle --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate MobileNetV2-Shared in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=MobileNetV2_Shared --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

To evaluate MobileNetV2-Shared++ in the paper on ILSVRC2012, run:

python3 eval_ilsvrc.py --pretrained=<path_to_model> --model=MobileNetV2_SharedDouble --dataset_path=<path_to_dataset> --visible_device=0,1,2,3

Results and Pretrained models

Our models achieve the following performance on ILSVRC2012:

ILSVRC2012 Classification

Model name            Top 1 Error  Top 5 Error  Params  FLOPs  Pretrained model
ResNet34-S48U1*       26.67%       8.54%        11.79M  3.26G  Download
ResNet50-Shared       23.64%       6.98%        20.51M  4.11G  Download
ResNet50-Shared++     23.95%       7.14%        18.26M  4.11G  Download
ResNet101-Shared      22.31%       6.47%        29.47M  7.83G  Download
MobileNetV2-Shared    27.61%       9.34%        3.24M   0.33G  Download
MobileNetV2-Shared++  28.21%       9.85%        2.98M   0.33G  Download
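For readers unfamiliar with the error columns, the sketch below shows the usual definition of top-k error: a sample counts as an error when its true label is not among the k highest-scoring classes. It is plain Python on toy data, written for illustration, and is not the code in `eval_ilsvrc.py`; the function name `topk_error` is hypothetical.

```python
def topk_error(scores, labels, k):
    """Percentage of samples whose true label is not among the k top scores.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda i: sample_scores[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return 100.0 * misses / len(labels)


if __name__ == "__main__":
    scores = [
        [0.1, 0.7, 0.2],  # highest score: class 1
        [0.5, 0.3, 0.2],  # highest score: class 0
        [0.2, 0.3, 0.5],  # highest score: class 2
        [0.6, 0.1, 0.3],  # highest score: class 0
    ]
    labels = [1, 1, 0, 0]
    print(topk_error(scores, labels, k=1))  # 50.0
    print(topk_error(scores, labels, k=2))  # 25.0
```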

Contributing

There is currently no way to contribute to the code; this may change in the future.

Contributors

  • ssregibility
  • wchkang
