
cifar-10-cnn's Introduction

Convolutional Neural Networks for CIFAR-10

This repository contains implementations of several CNN architectures for CIFAR-10.


All of these CNN models are implemented with Keras and TensorFlow.
A PyTorch version is available at CIFAR-ZOO.

Requirements

  • Python (3.5)
  • keras (>= 2.1.5)
  • tensorflow-gpu (>= 1.4.1)

Architectures and papers

Documents & tutorials

Some documents and tutorials are available in doc and in issues/3; grab them if you need them.
There are also related articles if you read Chinese.

Accuracy of all my implementations

In particular: change the batch size according to your GPU's memory, and modifying the learning rate schedule may improve accuracy.

| network | GPU | params | batch size | epoch | training time | accuracy (%) |
|---|---|---|---|---|---|---|
| Lecun-Network | GTX1080TI | 62k | 128 | 200 | 30 min | 76.23 |
| Network-in-Network | GTX1080TI | 0.97M | 128 | 200 | 1 h 40 min | 91.63 |
| Vgg19-Network | GTX1080TI | 39M | 128 | 200 | 1 h 53 min | 93.53 |
| Residual-Network20 | GTX1080TI | 0.27M | 128 | 200 | 44 min | 91.82 |
| Residual-Network32 | GTX1080TI | 0.47M | 128 | 200 | 1 h 7 min | 92.68 |
| Residual-Network110 | GTX1080TI | 1.7M | 128 | 200 | 3 h 38 min | 93.93 |
| Wide-resnet 16x8 | GTX1080TI | 11.3M | 128 | 200 | 4 h 55 min | 95.13 |
| Wide-resnet 28x10 | GTX1080TI | 36.5M | 128 | 200 | 10 h 22 min | 95.78 |
| DenseNet-100x12 | GTX1080TI | 0.85M | 64 | 250 | 17 h 20 min | 94.91 |
| DenseNet-100x24 | GTX1080TI | 3.3M | 64 | 250 | 22 h 27 min | 95.30 |
| DenseNet-160x24 | 1080 x 2 | 7.5M | 64 | 250 | 50 h 20 min | 95.90 |
| ResNeXt-4x64d | GTX1080TI | 20M | 120 | 250 | 21 h 3 min | 95.19 |
| SENet(ResNeXt-4x64d) | GTX1080TI | 20M | 120 | 250 | 21 h 57 min | 95.60 |

About LeNet and CNN training tips/tricks

LeNet was the first CNN, proposed by LeCun.
I use different training tricks here to show how to train your model efficiently.

LeNet_keras.py is the LeNet baseline;
LeNet_dp_keras.py adds Data Preprocessing [DP];
LeNet_dp_da_keras.py adds both DP and Data Augmentation [DA];
LeNet_dp_da_wd_keras.py adds DP, DA, and Weight Decay [WD]. (A sketch of these tricks follows the results table below.)

| network | GPU | DP | DA | WD | training time | accuracy (%) |
|---|---|---|---|---|---|---|
| LeNet_keras | GTX1080TI | - | - | - | 5 min | 58.48 |
| LeNet_dp_keras | GTX1080TI | ✓ | - | - | 5 min | 60.41 |
| LeNet_dp_da_keras | GTX1080TI | ✓ | ✓ | - | 26 min | 75.06 |
| LeNet_dp_da_wd_keras | GTX1080TI | ✓ | ✓ | ✓ | 26 min | 76.23 |
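As a rough illustration of what these tricks look like in Keras (a minimal sketch, not the scripts' exact code: the mean/std values match the per-channel statistics used in this repository, while the shift range and weight-decay strength are illustrative assumptions):

from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

# [DP] data preprocessing: per-channel mean/std normalization
mean = [125.307, 122.95, 113.865]
std = [62.9932, 62.0087, 66.7048]
for i in range(3):
    x_train[:, :, :, i] = (x_train[:, :, :, i] - mean[i]) / std[i]
    x_test[:, :, :, i] = (x_test[:, :, :, i] - mean[i]) / std[i]

# [DA] data augmentation: random shifts plus horizontal flips
datagen = ImageDataGenerator(width_shift_range=0.125,
                             height_shift_range=0.125,
                             horizontal_flip=True)

# [WD] weight decay: attach an L2 penalty to layer kernels,
# e.g. Dense(120, kernel_regularizer=wd)
wd = l2(1e-4)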

For more CNN training tricks, see Must Know Tips/Tricks in Deep Neural Networks (by Xiu-Shen Wei)

About Learning Rate schedule

Different learning rate schedules may produce different training/testing accuracy.
See ./htd and HTD for more details.
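For illustration, here is a minimal sketch of a tanh-style decay schedule in the spirit of HTD (the form lr(t) = lr0/2 * (1 - tanh(L + (U - L) * t / T)) and the bounds L = -6, U = 3 are assumptions based on my reading of the paper; see ./htd for the exact implementation):

import math
from keras.callbacks import LearningRateScheduler

def htd_schedule(epoch):
    # illustrative values: initial lr, tanh bounds, total epochs
    lr0, L, U, T = 0.1, -6.0, 3.0, 200
    return lr0 / 2.0 * (1.0 - math.tanh(L + (U - L) * epoch / T))

change_lr = LearningRateScheduler(htd_schedule)
# then pass change_lr in the callbacks list of model.fit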

Since recent versions of Keras support keras.utils.multi_gpu_model, you can simply use the following code to train your model with multiple GPUs:

from keras.utils import multi_gpu_model
from keras.applications.resnet50 import ResNet50

model = ResNet50()

# Replicates `model` on 8 GPUs.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='adam')

# This `fit` call will be distributed on 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)
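One caveat worth noting from the Keras documentation: save and load weights through the template model (the model you passed to multi_gpu_model) rather than through parallel_model, so the checkpoint stays usable regardless of the number of GPUs.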

About ResNeXt & DenseNet

Since I don't have enough machines to train the larger networks, I only trained the smallest network described in each paper. You can see results for the larger models in liuzhuang13/DenseNet and prlz77/ResNeXt.pytorch.

   

Please feel free to contact me if you have any questions!

Citation

@misc{bigballon2017cifar10cnn,
  author = {Wei Li},
  title = {cifar-10-cnn: Play deep learning with CIFAR datasets},
  howpublished = {\url{https://github.com/BIGBALLON/cifar-10-cnn}},
  year = {2017}
}

cifar-10-cnn's People

Contributors

bigballon, chenhaozou


cifar-10-cnn's Issues

Something wrong in SENet

Something went wrong with SENet when I copied the code and trained it on my CPU. In the first epoch (1/250), the loss decreased and the accuracy increased at first, but when the iterations reached 464/781 the loss became NaN. I don't know what happened.
The config:
cardinality = 4
batch_size = 64
iterations = 781

Please help me.
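As an aside not from the original thread: Keras provides a TerminateOnNaN callback that stops training as soon as the loss becomes NaN, which makes this kind of failure faster to catch:

from keras.callbacks import TerminateOnNaN

# stop training immediately when the loss becomes NaN instead of
# running through the remaining iterations
callbacks = [TerminateOnNaN()]
# model.fit(x_train, y_train, callbacks=callbacks, ...)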

in densenet, after "transition layer", the variable "nchannels" is not updated.

suppose x has 24 channels, nblocks=5, nchannels=12, then

    x, nchannels = dense_block(x, nblocks, nchannels) # nchannels = 24+12*5 = 84, ok
    x = transition(x, nchannels) # x's channels is 84/2=42, ok
    x, nchannels = dense_block(x, nblocks, nchannels) #nchannels = 84+12*5 = 144, not ok. should be 42+12*5= 102
    x = transition(x, nchannels) #x's channels is 144/2=72, not ok? should be 102/2=51
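A minimal sketch of the fix this implies, assuming the repository's dense_block and transition helpers and the compression factor 0.5 used in the arithmetic above: halve the nchannels bookkeeping right after each transition layer.

    x, nchannels = dense_block(x, nblocks, nchannels)  # nchannels = 24 + 12*5 = 84
    x = transition(x, nchannels)                       # output channels: 84 // 2 = 42
    nchannels = nchannels // 2                         # keep the counter in sync
    x, nchannels = dense_block(x, nblocks, nchannels)  # nchannels = 42 + 12*5 = 102
    x = transition(x, nchannels)                       # output channels: 102 // 2 = 51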

About DenseNet_keras.py

I have read the DenseNet paper, and I think the last transition layer is not needed in your implementation.

In the paper, the number of transition layers is one less than the number of dense blocks. The original paper uses 4 blocks, so the number of additional layers is
FirstLayer(1) + TransitionLayer(3) + LastLayer(1) = 5

This implementation has 3 dense blocks, so the number of additional layers is
FirstLayer(1) + TransitionLayer(2) + LastLayer(1) = 4

What do you think? Thank you!

I cannot get your result

I get 10% accuracy if I modify x_train this way:

mean = [125.307, 122.95, 113.865]
std = [62.9932, 62.0087, 66.7048]
for i in range(3):
    x_train[:,:,i] = (x_train[:,:,i] - mean[i]) / std[i]
    x_test[:,:,i] = (x_test[:,:,i] - mean[i]) / std[i]

but I get 52% accuracy if I modify x_train this way:

x_train /= 255
x_test /= 255

I don't know why I can't reproduce your result. Please help, thanks.

My code is:

import keras
from keras import optimizers
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
from keras.callbacks import LearningRateScheduler, TensorBoard

batch_size = 128
epochs = 10
iteration = 391
num_classes = 10
log_filepath = './lenet'

##kernel_initializer:?????
def build_model():
    model = Sequential()
    model.add(Conv2D(6, (5, 5), padding='valid', activation='relu', kernel_initializer='he_normal', input_shape=(32, 32, 3)))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
    model.add(Conv2D(16, (5, 5), padding='valid', activation='relu', kernel_initializer='he_normal'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))
    model.add(Flatten())
    model.add(Dense(120, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(84, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(num_classes, activation='softmax', kernel_initializer='he_normal'))

    sgd = optimizers.SGD(lr=0.1, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

    return model

def scheduler(epoch):
    learning_rate_init = 0.02
    if epoch >= 80:
        learning_rate_init = 0.01
    if epoch >= 150:
        learning_rate_init = 0.004
    return learning_rate_init

if __name__ == '__main__':
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()  ## values ???
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    # x_train /= 255
    # x_test /= 255
    mean = [125.307, 122.95, 113.865]
    std = [62.9932, 62.0087, 66.7048]
    for i in range(3):
        x_train[:,:,i] = (x_train[:,:,i] - mean[i]) / std[i]
        x_test[:,:,i] = (x_test[:,:,i] - mean[i]) / std[i]
    model = build_model()
    print(model.summary())

    tb_cb = TensorBoard(log_dir=log_filepath, histogram_freq=0)
    change_lr = LearningRateScheduler(scheduler)
    cbks = [tb_cb, change_lr]

    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, callbacks=cbks, validation_data=(x_test, y_test), shuffle=True)

    model.save('lenet.h5')
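An observation on the snippet above (mine, not from the original thread): for channels-last CIFAR-10 arrays of shape (num_samples, 32, 32, 3), the channel axis is the fourth one, so x_train[:,:,i] normalizes three of the 32 image columns rather than the three color channels, leaving most pixels at the raw 0-255 scale; with lr = 0.1 that can easily diverge to 10% accuracy. The loop should index all four axes:

for i in range(3):
    x_train[:, :, :, i] = (x_train[:, :, :, i] - mean[i]) / std[i]
    x_test[:, :, :, i] = (x_test[:, :, :, i] - mean[i]) / std[i]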

Unknown layer: layers

I call

keras.models.load_model('vgg19_cifar10.h5')

but I get an error:

D:\Users\Python\python3.exe D:/pythonwork/Adversarial/Cifar10_keras_Adversarial/MisClassification.py
Using TensorFlow backend.
Traceback (most recent call last):
  File "D:/pythonwork/Adversarial/Cifar10_keras_Adversarial/MisClassification.py", line 100, in <module>
    main()
  File "D:/pythonwork/Adversarial/Cifar10_keras_Adversarial/MisClassification.py", line 21, in main
    kmodel = load_model(weight_path)
  File "D:\Users\Python\lib\site-packages\keras\engine\saving.py", line 261, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "D:\Users\Python\lib\site-packages\keras\engine\saving.py", line 335, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "D:\Users\Python\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "D:\Users\Python\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "D:\Users\Python\lib\site-packages\keras\engine\sequential.py", line 292, in from_config
    custom_objects=custom_objects)
  File "D:\Users\Python\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "D:\Users\Python\lib\site-packages\keras\utils\generic_utils.py", line 165, in deserialize_keras_object
    ':' + function_name)
ValueError: Unknown layer: layers

Not converging

Network_in_Network_keras.py
Training accuracy is always around 1.0%.

Errors in SENet_Keras.py

Hi Wei Li, thanks for sharing this great code! I learned a lot from it.
I found two errors in SENet_Keras.py:
1. In line 77 you use y = add_common_layer(y), and add_common_layer includes BN and ReLU. From the official code, https://github.com/hujie-frank/SENet, we can see that at the end of a ResNet block there is only BN, yet you apply a ReLU before the SE block, which is not intuitive. So line 77 should be y = BatchNormalization(momentum=0.9, epsilon=1e-5)(y).
2. In line 48 you define the global variable inplanes, but you reassign inplanes in residual_layer, so you should declare inplanes as global inside residual_layer (line 89). You can visualize the current model structure: in each shortcut branch there is a conv + BN.
Thanks again for sharing!
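A minimal, self-contained sketch of the pitfall in point 2 (the names are illustrative, not the repository's exact code): assigning to a module-level variable inside a function creates a new local variable unless the function declares it global.

inplanes = 64  # module-level channel counter

def residual_layer_wrong(out_planes):
    # BUG: this assignment creates a local 'inplanes';
    # the module-level counter is never updated
    inplanes = out_planes

def residual_layer_fixed(out_planes):
    global inplanes  # write to the module-level counter
    inplanes = out_planes

residual_layer_wrong(128)
print(inplanes)  # still 64
residual_layer_fixed(128)
print(inplanes)  # now 128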

NIN training problem

Hi, when training the NIN model I first used your initialization method and then tried he_normal, but the results are still unsatisfactory and don't reach LeNet's level. What could be the reason? Also, I noticed that you did not use mlpconv but rather a conv + relu + bn stack.

Wrong number of parameters in README

I found that both ResNet-20 and ResNet-110 are listed with the same number of parameters, 0.27M. Is there something wrong with that?

| Residual-Network20 | GTX1080TI | 0.27M | 128 | 200 | 44 min | 91.82 |
| Residual-Network32 | GTX1080TI | 0.47M | 128 | 200 | 1 h 7 min | 92.68 |
| Residual-Network50 | GTX1080TI | 1.7M | 128 | 200 | 1 h 42 min | 93.18 |
| Residual-Network110 | GTX1080TI | 0.27M | 128 | 200 | 3 h 38 min | 93.93 |

Thanks

Thanks, Wei Li. It helps me a lot!

My result is wrong, can you help me?

I just followed your code, but my model does not predict correctly. Why? Is there something else I need to do?

I used your retrain.h5 for prediction like this:

model = VGG19(weights=None)
filepath1 = os.path.abspath('retrain.h5')
model.load_weights(filepath=filepath1, by_name=True)

This is my result:

Please input picture file to predict ( input Q to exit ): test_pic/tiger.jpeg
Predicted: [('n04200800', 'shoe_shop', 0.0059397803), ('n04462240', 'toyshop', 0.0048586507), ('n02640242', 'sturgeon', 0.0048460886), ('n12985857', 'coral_fungus', 0.0044603818), ('n03063689', 'coffeepot', 0.0042976518)]

I don't know why...

About learning_rate

Can the callback function pass the learning rate to the optimizer continuously during training? The learning rate is already set when the optimizer is created.
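As an aside not from the original thread: yes. Keras's LearningRateScheduler rewrites the optimizer's learning-rate variable at the start of every epoch, so the value given when the optimizer is constructed is only the initial one. A sketch of the mechanism, reusing the step values from the scheduler code above:

import keras.backend as K
from keras.callbacks import Callback

class StepDecay(Callback):
    # roughly what LearningRateScheduler does internally: compute a
    # new value and write it into the optimizer's lr variable at the
    # beginning of every epoch
    def on_epoch_begin(self, epoch, logs=None):
        new_lr = 0.02 if epoch < 80 else (0.01 if epoch < 150 else 0.004)
        K.set_value(self.model.optimizer.lr, new_lr)

# model.fit(..., callbacks=[StepDecay()])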
