Bag of Tricks for Image Classification with Convolutional Neural Networks

This repo was inspired by the paper Bag of Tricks for Image Classification with Convolutional Neural Networks.

I will test as many popular training tricks as I can for improving image classification accuracy. Feel free to leave a comment about the tricks you want me to test (please cite the referenced paper along with the trick).

hardware

The experiments run on 4 Tesla P40 GPUs.

dataset

I use the CUB_200_2011 dataset instead of ImageNet, just for simplicity. It is a fine-grained image classification dataset that contains 200 bird categories, 5K+ training images, and 5K+ test images. The state-of-the-art accuracy with VGG16 is around 73% (please correct me if I am wrong). You can easily switch to another dataset you like: Stanford Dogs, Stanford Cars, or even ImageNet.

network

I use a VGG16 network to test the tricks, also for simplicity, since VGG16 is easy to implement. I'm considering switching to AlexNet to see how powerful these tricks are.

tricks

Tricks I've tested so far; some of them are from the paper Bag of Tricks for Image Classification with Convolutional Neural Networks:

| trick | referenced paper |
|---|---|
| xavier init | Understanding the difficulty of training deep feedforward neural networks |
| warmup training | Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour |
| no bias decay | Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes |
| label smoothing | Rethinking the inception architecture for computer vision |
| random erasing | Random Erasing Data Augmentation |
| cutout | Improved Regularization of Convolutional Neural Networks with Cutout |
| linear scaling learning rate | Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour |
| cosine learning rate decay | SGDR: Stochastic Gradient Descent with Warm Restarts |

and more to come......
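
As an illustration of one of the tricks listed above, here is a minimal sketch of label smoothing in plain Python. It follows the formulation from the Inception paper, where the one-hot target is mixed with a uniform distribution over the K classes. The function names and the value eps=0.1 are assumptions made for this example; they are not taken from this repo's code.

```python
import math


def smoothed_targets(label, num_classes, eps=0.1):
    """Smoothed target distribution: (1 - eps) * one_hot + eps / num_classes."""
    uniform = eps / num_classes
    targets = [uniform] * num_classes
    targets[label] += 1.0 - eps
    return targets


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def smoothed_cross_entropy(logits, label, eps=0.1):
    """Cross-entropy between the smoothed targets and softmax(logits)."""
    targets = smoothed_targets(label, len(logits), eps)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


if __name__ == "__main__":
    # 200 classes, as in CUB_200_2011; the logits here are made up.
    logits = [0.0] * 200
    logits[3] = 5.0
    print(smoothed_cross_entropy(logits, label=3, eps=0.1))
```

With eps=0 this reduces to the usual cross-entropy, which makes it easy to check the smoothed loss against the unsmoothed one.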

result

baseline (training from scratch, no ImageNet pretrained weights are used):

vgg16 64.60% on CUB_200_2011 dataset, lr=0.01, batchsize=64

effects of stacking tricks

| trick | acc |
|---|---|
| baseline | 64.60% |
| +xavier init and warmup training | 66.07% |
| +no bias decay | 70.14% |
| +label smoothing | 71.20% |
| +random erasing | does not work, drops about 4 points |
| +linear scaling learning rate (batchsize 256, lr 0.04) | 71.21% |
| +cutout | does not work, drops about 1 point |
| +cosine learning rate decay | does not work, drops about 1 point |
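
The learning-rate tricks in the results can be sketched in a few lines of plain Python. The linear scaling entry is consistent with the baseline: lr 0.01 at batchsize 64, scaled by 256/64, gives 0.04. The warmup length, the total number of epochs, and the per-epoch (rather than per-iteration) update are assumptions made for this illustration; the function names are hypothetical and not from this repo.

```python
import math


def scaled_lr(base_lr, base_batch, batch):
    """Linear scaling rule: lr grows in proportion to the batch size."""
    return base_lr * batch / base_batch


def lr_at_epoch(epoch, total_epochs, peak_lr, warmup_epochs=5):
    """Linear warmup to peak_lr, then cosine decay towards zero.

    warmup_epochs=5 is an assumed value, not one reported by this repo.
    """
    if epoch < warmup_epochs:
        # Ramp up linearly; the last warmup epoch reaches peak_lr.
        return peak_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    peak = scaled_lr(base_lr=0.01, base_batch=64, batch=256)
    for epoch in (0, 4, 5, 50, 99):
        print(epoch, lr_at_epoch(epoch, total_epochs=100, peak_lr=peak))
```

Dropping the cosine branch and keeping a constant or step schedule after warmup gives the warmup-only variant.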

Contributors

weiaicunzai
