CNN Architectures for Image Classification

Author

Arpit Aggarwal

Introduction to the Project

In this project, several CNN architectures (AlexNet, VGG-16, VGG-19, InceptionNet, DenseNet-121 and ResNet-50) were used for the task of Dog-Cat image classification. The input to each network was a (224 x 224 x 3) image and there were 2 classes, where '0' denoted a cat and '1' denoted a dog. The architectures were implemented in PyTorch and the loss function was Cross Entropy Loss. The hyperparameters tuned were: number of epochs (e), learning rate (lr), momentum (m), weight decay (wd) and batch size (bs).
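As a rough illustration of how these pieces fit together in PyTorch, the sketch below runs one training step on random data. The tiny `model` is a hypothetical stand-in for the real architectures, the use of `torch.optim.SGD` is an assumption inferred from the momentum hyperparameter (the project does not name its optimizer here), and the hyperparameter values are simply those listed for one of the runs in the Results section.

```python
import torch
from torch import nn

# Values borrowed from one reported run (lr = 0.001, m = 0.9, wd = 5e-4).
lr, momentum, weight_decay = 0.001, 0.9, 5e-4

# Hypothetical stand-in network, NOT one of the architectures from the project.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # collapse the 224 x 224 spatial grid to 1 x 1
    nn.Flatten(),
    nn.Linear(8, 2),           # two logits: index 0 = cat, index 1 = dog
)

criterion = nn.CrossEntropyLoss()
# Assumption: SGD with momentum and weight decay.
optimizer = torch.optim.SGD(
    model.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay
)

# Random batch of 4 images in PyTorch's (N, C, H, W) layout, with class labels.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 1, 0])

# One optimisation step: forward pass, loss, backward pass, parameter update.
logits = model(images)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In a real run, the stand-in would be replaced by the chosen architecture with a 2-class output layer, and the step would be repeated over every batch for `e` epochs.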

Data

The data for the task of Dog-Cat image classification can be downloaded from: https://drive.google.com/drive/folders/1EdVqRCT1NSYT6Ge-SvAIu7R5i9Og2tiO?usp=sharing. The dataset has been divided into three sets: Training data, Validation data and Testing data. The analysis of the different CNN architectures was based on comparing their Training Accuracy and Validation Accuracy values.

Methods for preventing overfitting

The methods applied to prevent overfitting were: Label Smoothing, Weight Decay, Data Augmentation and Dropout. To learn more about the label smoothing regularization technique, refer to the links provided in the Credits section.
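To make label smoothing concrete, here is a minimal pure-Python sketch of the common formulation, in which the one-hot target is mixed with a uniform distribution over the classes. The function names and the smoothing factor `epsilon = 0.1` are illustrative assumptions; the value used in this project is not stated.

```python
import math

def smooth_labels(target, num_classes, epsilon):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == target else uniform
        for k in range(num_classes)
    ]

def cross_entropy(logits, target_dist):
    """Cross entropy between softmax(logits) and a target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(q * (z - log_sum) for q, z in zip(target_dist, logits))

# A 'dog' label (class 1) with an assumed smoothing factor of 0.1.
soft = smooth_labels(target=1, num_classes=2, epsilon=0.1)
hard = [0.0, 1.0]

# A very confident, correct prediction is penalised more under smoothed targets,
# which discourages the network from producing extreme logits.
confident_logits = [-5.0, 5.0]
loss_hard = cross_entropy(confident_logits, hard)
loss_soft = cross_entropy(confident_logits, soft)
```

With two classes and `epsilon = 0.1`, the smoothed target puts 0.95 on the true class and 0.05 on the other. Recent PyTorch versions also expose a `label_smoothing` argument on `nn.CrossEntropyLoss`; whether the notebooks use that or a custom loss is not specified here.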

Results

The results after using different CNN architectures are given below:

  1. AlexNet

Not Pre-trained: Training Accuracy = 96.64% and Validation Accuracy = 93.59% (e = 50, lr = 0.005, m = 0.9, bs = 64, wd = 0.001)

Pre-trained: Training Accuracy = 98.01% and Validation Accuracy = 96.63% (e = 50, lr = 0.005, m = 0.9, bs = 64, wd = 0.001)

  2. VGG-16

Not Pre-trained: Training Accuracy = 98.32% and Validation Accuracy = 97.13% (e = 100, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)

Pre-trained: Training Accuracy = 99.22% and Validation Accuracy = 98.58% (e = 30, lr = 0.001, m = 0.9, bs = 32, wd = 5e-4)

  3. VGG-19

Not Pre-trained: Training Accuracy = 98.71% and Validation Accuracy = 97.05% (e = 100, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)

Pre-trained: Training Accuracy = 99.15% and Validation Accuracy = 98.53% (e = 30, lr = 0.001, m = 0.9, bs = 32, wd = 5e-4)

  4. DenseNet-121

Not Pre-trained: Training Accuracy = 99.2% and Validation Accuracy = 92.9% (e = 60, lr = 0.003, m = 0.9, bs = 32, wd = 0.001)

Pre-trained: Training Accuracy = 98.73% and Validation Accuracy = 98.01% (e = 40, lr = 0.01, m = 0.9, bs = 32, wd = 5e-4)

  5. InceptionNet

Not Pre-trained: Training Accuracy = 98.1% and Validation Accuracy = 93.8% (e = 60, lr = 0.003, m = 0.9, bs = 32, wd = 0.001)

Pre-trained: Training Accuracy = 99.94% and Validation Accuracy = 99.13% (e = 40, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)

  6. ResNet-50

Not Pre-trained: Training Accuracy = 96.33% and Validation Accuracy = 92.72% (e = 100, lr = 1e-3, m = 0.9, bs = 32, wd = 5e-4)

Pre-trained: Training Accuracy = 99.43% and Validation Accuracy = 98.43% (e = 30, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)
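Since the analysis rests on comparing Training Accuracy with Validation Accuracy, one simple way to read the numbers above is to compute the gap between the two for each run. The sketch below does that with the accuracies copied from this section; the dictionary layout and the helper name `gap` are illustrative choices, not part of the project's notebooks.

```python
# (training accuracy %, validation accuracy %) as reported in this section.
not_pretrained = {
    "AlexNet": (96.64, 93.59),
    "VGG-16": (98.32, 97.13),
    "VGG-19": (98.71, 97.05),
    "DenseNet-121": (99.2, 92.9),
    "InceptionNet": (98.1, 93.8),
    "ResNet-50": (96.33, 92.72),
}
pretrained = {
    "AlexNet": (98.01, 96.63),
    "VGG-16": (99.22, 98.58),
    "VGG-19": (99.15, 98.53),
    "DenseNet-121": (98.73, 98.01),
    "InceptionNet": (99.94, 99.13),
    "ResNet-50": (99.43, 98.43),
}

def gap(accuracies):
    """Training minus validation accuracy; a larger gap hints at more overfitting."""
    train_acc, val_acc = accuracies
    return train_acc - val_acc

# Model with the highest validation accuracy among the pre-trained runs.
best_pretrained = max(pretrained, key=lambda name: pretrained[name][1])

# Model with the widest train/validation gap among the runs trained from scratch.
widest_gap_model = max(not_pretrained, key=lambda name: gap(not_pretrained[name]))

for name in not_pretrained:
    print(
        f"{name}: gap {gap(not_pretrained[name]):.2f} (not pre-trained), "
        f"{gap(pretrained[name]):.2f} (pre-trained)"
    )
```

For every architecture listed, the pre-trained run reaches a higher validation accuracy and a smaller gap than the run trained from scratch, although the runs use different hyperparameters, so the comparison is indicative rather than controlled.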

Software Required

To run the Jupyter notebooks, use Python 3. Standard libraries like NumPy and PyTorch are used.

Credits

The following links were helpful for this project:

  1. https://www.youtube.com/channel/UC88RC_4egFjV9jfjBHwDuvg
  2. https://github.com/pytorch/tutorials
  3. https://towardsdatascience.com/what-is-label-smoothing-108debd7ef06
  4. https://leimao.github.io/blog/Label-Smoothing/
