carnet-lightweight-resnet

Introduction

CARNet is a modified version of ResNet; the modified code is based on ResNet18. By adding a lightweight attention module that recalibrates features, CARNet cuts parameters by 94.2% (from 11.2M to 0.65M), computation by 95% (from 32.6 GFLOPs to 1.63 GFLOPs), and model size by about 42% (from 618MB to 360MB) compared to the original ResNet, while maintaining comparable accuracy on the same dataset.
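
The parameter and computation reductions quoted above follow directly from the before/after figures. A quick check in plain Python (the helper name `reduction_pct` is only illustrative):

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when going from `before` to `after`."""
    return (before - after) / before * 100.0


params = reduction_pct(11.2, 0.65)  # millions of parameters
flops = reduction_pct(32.6, 1.63)   # GFLOPs

print(f"parameters:  {params:.1f}%")  # parameters:  94.2%
print(f"computation: {flops:.1f}%")   # computation: 95.0%
```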

Good Performance with Limited Resources

In previous experiments, repeated training runs of the standard ResNet18 on this dataset gave top-1 accuracies consistently above 77.73%, averaging 79.14% after 50 epochs. ResNet generalized well and reached high accuracy, but its large model size demanded substantial computational resources, which made training difficult without high-end hardware. Initial attempts to train ResNet on a laptop with 24GB RAM and an 8GB RTX 4060 GPU failed: the computational demand caused multiple crashes. We then switched to a more powerful desktop with 32GB RAM and an 11GB RTX 2080Ti GPU, where utilization still stayed above 85%.

In contrast, our modified CARNet, with fewer parameters and lower computational needs, trained without problems on the same laptop, with GPU usage dropping to around 80%. On a workstation with a 2080Ti GPU, training CARNet for 50 epochs took under 1.7 hours and reached a stable top-1 accuracy above 81%, averaging 81.8%, which surpasses the standard ResNet18. These findings show that CARNet delivers comparable or better accuracy with significantly lower computational requirements and training time, a major advantage for deployment on resource-limited edge devices.

Fig1

How to use

The image dataset is from this Kaggle link: Kaggle Dataset

  • SportsClassify.py -- Classifies the dataset with the standard ResNet18
  • CARNet.py -- Classifies the dataset with CARNet
  • ResNet18.txt -- Detailed information of the ResNet18 model (various layers, params, sizes)
  • CARNet.txt -- Detailed information of the CARNet model (various layers, params, sizes)

Why propose CARNet?

In the original ResNet architecture, the BasicBlock module is used as the fundamental building unit of the residual network. However, we identify two issues with the BasicBlock design:

  1. The convolutional layers in BasicBlock lead to a large number of parameters, increasing model complexity.
  2. The simplicity of the BasicBlock structure limits the representation power of ResNet for complex tasks.

To overcome these limitations, we propose to substitute the BasicBlock with a customized module named GroupConvBlock in our improved ResNet. As shown in Fig2, the GroupConvBlock contains the following components:

Fig2

  • Grouped Convolution Layer (conv1): Instead of a regular convolution, conv1 uses a grouped convolution, which divides the input channels into groups and convolves only within each group. Because the connections between channels become sparse, the parameter count drops sharply. For example, with 256 input channels divided into 64 groups, conv1 needs 1/64 of the weights of a regular convolution with the same kernel size and output channels.
  • Batch Normalization (bn1): Applied after conv1 to normalize activations and stabilize training.
  • Attention Module (attn): A channel-wise attention module emphasizes informative features by multiplying its input by an attention mask. The mask is computed by global average pooling, which captures contextual information, followed by two convolutional layers and a sigmoid activation that produce the attention weights.
  • 1x1 Convolution (conv2): At the end of GroupConvBlock, a 1x1 convolution adjusts the channel count to match the residual connection. Its 1x1 kernels keep the parameter count small.
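
As a rough illustration of the components listed above, here is a minimal PyTorch sketch of such a block. It is not the code from CARNet.py. The 3x3 kernel in conv1, the `reduction` ratio and the ReLU inside the attention module, the final ReLU, and the 1x1 shortcut projection are all assumptions made for this sketch, and the names `ChannelAttention` and `n_params` are hypothetical.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Global average pool -> two 1x1 convs -> sigmoid mask, multiplied onto the input."""

    def __init__(self, channels: int, reduction: int = 16):  # reduction: assumed value
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)  # one value per channel
        self.mask = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),  # assumed non-linearity between the two convs
            nn.Conv2d(hidden, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.mask(self.pool(x))  # mask broadcasts over height and width


class GroupConvBlock(nn.Module):
    """conv1 (grouped) -> bn1 -> attn -> conv2 (1x1), plus a residual connection."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1, groups: int = 64):
        super().__init__()
        # in_ch must be divisible by groups; the 3x3 kernel is an assumption.
        self.conv1 = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                               padding=1, groups=groups, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.attn = ChannelAttention(in_ch)
        self.conv2 = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        # Assumed shortcut: project the input only when its shape changes.
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn1(self.conv1(x))
        out = self.attn(out)
        out = self.conv2(out)
        return torch.relu(out + self.shortcut(x))  # final ReLU is an assumption


def n_params(module: nn.Module) -> int:
    """Total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# The 1/64 figure: 256 input channels split into 64 groups.
dense = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(256, 256, kernel_size=3, padding=1, groups=64, bias=False)
print(n_params(dense), n_params(grouped))  # 589824 9216

block = GroupConvBlock(256, 256).eval()
with torch.no_grad():
    y = block(torch.randn(2, 256, 8, 8))
print(tuple(y.shape))  # (2, 256, 8, 8)
```

With 64 groups, each output channel of conv1 sees only 256 / 64 = 4 input channels, so the weight count is 256 × 4 × 9 = 9216 instead of 256 × 256 × 9 = 589824, a factor of 64.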
