Designing a neural network can be a complex task. It involves many steps and several regularization techniques, and it is easy to be unsure which technique to use, where, and when. The number of parameters a network needs also depends on the dataset at hand. Here is a step-by-step approach to designing a Convolutional Neural Network, using the MNIST Digit Recognizer dataset.
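Throughout this walkthrough, the layer output sizes follow the standard convolution output-size formula, n_out = floor((n_in + 2·padding − kernel) / stride) + 1. A minimal sketch (the helper name `conv_out` is just for illustration):

```python
def conv_out(n_in, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer."""
    return (n_in + 2 * padding - kernel) // stride + 1

# 28x28 MNIST input through a 3x3 conv with no padding -> 26x26
print(conv_out(28, 3))       # 26
# 24x24 feature map through a 2x2 max pool with stride 2 -> 12x12
print(conv_out(24, 2, stride=2))  # 12
```

The same arithmetic reproduces every Output Shape in the model summary below.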
Experiments | Model Stats | Analysis
---|---|---
1: The Basic Skeleton | |
2: With Data Augmentation | |
3: Best Model with LR Schedulers | |
Check out the MNIST state-of-the-art results: the top model, Branching/Merging CNN + Homogeneous Vector Capsules, achieves 99.87% accuracy with 1,514,187 trainable parameters, while we reach 99.53% with just 7.4k parameters. NOTE: this is not a comparison, and I am not claiming our model is better. The point is the importance of parameter count when creating a network; it should be no surprise that higher accuracy may require more parameters.
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 8, 26, 26] 72
BatchNorm2d-2 [-1, 8, 26, 26] 16
ReLU-3 [-1, 8, 26, 26] 0
Dropout-4 [-1, 8, 26, 26] 0
Conv2d-5 [-1, 16, 24, 24] 1,152
BatchNorm2d-6 [-1, 16, 24, 24] 32
ReLU-7 [-1, 16, 24, 24] 0
Dropout-8 [-1, 16, 24, 24] 0
MaxPool2d-9 [-1, 16, 12, 12] 0
Conv2d-10 [-1, 8, 12, 12] 128
BatchNorm2d-11 [-1, 8, 12, 12] 16
ReLU-12 [-1, 8, 12, 12] 0
Conv2d-13 [-1, 12, 10, 10] 864
BatchNorm2d-14 [-1, 12, 10, 10] 24
ReLU-15 [-1, 12, 10, 10] 0
Dropout-16 [-1, 12, 10, 10] 0
Conv2d-17 [-1, 16, 8, 8] 1,728
BatchNorm2d-18 [-1, 16, 8, 8] 32
ReLU-19 [-1, 16, 8, 8] 0
Dropout-20 [-1, 16, 8, 8] 0
Conv2d-21 [-1, 20, 6, 6] 2,880
BatchNorm2d-22 [-1, 20, 6, 6] 40
ReLU-23 [-1, 20, 6, 6] 0
Dropout-24 [-1, 20, 6, 6] 0
AvgPool2d-25 [-1, 20, 1, 1] 0
Conv2d-26 [-1, 16, 1, 1] 320
BatchNorm2d-27 [-1, 16, 1, 1] 32
ReLU-28 [-1, 16, 1, 1] 0
Dropout-29 [-1, 16, 1, 1] 0
Conv2d-30 [-1, 10, 1, 1] 160
================================================================
Total params: 7,496
Trainable params: 7,496
Non-trainable params: 0
----------------------------------------------------------------
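The Param # column above can be reproduced by hand: a bias-free Conv2d contributes in_channels × out_channels × k × k weights, and a BatchNorm2d contributes 2 × channels (scale and shift); ReLU, Dropout, and pooling add nothing. A quick sanity check in pure Python (no torch needed):

```python
def conv_params(c_in, c_out, k):
    # Bias-free convolution: one k x k filter per (input, output) channel pair
    return c_in * c_out * k * k

def bn_params(c):
    # BatchNorm2d learns a weight and a bias per channel
    return 2 * c

total = sum([
    conv_params(1, 8, 3),   bn_params(8),    # Conv2d-1,  BatchNorm2d-2
    conv_params(8, 16, 3),  bn_params(16),   # Conv2d-5,  BatchNorm2d-6
    conv_params(16, 8, 1),  bn_params(8),    # 1x1 transition after MaxPool
    conv_params(8, 12, 3),  bn_params(12),   # Conv2d-13, BatchNorm2d-14
    conv_params(12, 16, 3), bn_params(16),   # Conv2d-17, BatchNorm2d-18
    conv_params(16, 20, 3), bn_params(20),   # Conv2d-21, BatchNorm2d-22
    conv_params(20, 16, 1), bn_params(16),   # 1x1 after GAP
    conv_params(16, 10, 1),                  # final 1x1 down to 10 classes
])
print(total)  # 7496
```

This matches the summary's total of 7,496 exactly, confirming all convolutions are bias-free.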
OPERATION | Nin | Nout | CHin | CHout | Padding | Kernel | Stride | jin | jout | rin | rout
---|---|---|---|---|---|---|---|---|---|---|---
CONVOLUTION | 28 | 26 | 1 | 8 | 0 | 3 | 1 | 1 | 1 | 1 | 3
CONVOLUTION | 26 | 24 | 8 | 16 | 0 | 3 | 1 | 1 | 1 | 3 | 5
MaxPool | 24 | 12 | 16 | 16 | 0 | 2 | 2 | 1 | 2 | 5 | 6
CONVOLUTION | 12 | 12 | 16 | 8 | 0 | 1 | 1 | 2 | 2 | 6 | 6
CONVOLUTION | 12 | 10 | 8 | 12 | 0 | 3 | 1 | 2 | 2 | 6 | 10
CONVOLUTION | 10 | 8 | 12 | 16 | 0 | 3 | 1 | 2 | 2 | 10 | 14
CONVOLUTION | 8 | 6 | 16 | 20 | 0 | 3 | 1 | 2 | 2 | 14 | 18
GAP | 6 | 1 | 20 | 20 | 0 | 6 | 1 | 2 | 2 | 18 | 28
CONVOLUTION | 1 | 1 | 20 | 16 | 0 | 1 | 1 | 2 | 2 | 28 | 28
CONVOLUTION | 1 | 1 | 16 | 10 | 0 | 1 | 1 | 2 | 2 | 28 | 28
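The jin/jout and rin/rout columns follow the usual receptive-field recurrence: with padding 0, n_out = (n_in − k)/s + 1, the jump grows as j_out = j_in · s, and the receptive field grows as r_out = r_in + (k − 1) · j_in. A sketch that replays the whole table (`step` is just an illustrative helper):

```python
def step(n, j, r, k, s=1):
    """One layer of receptive-field bookkeeping (padding 0)."""
    return (n - k) // s + 1, j * s, r + (k - 1) * j

# (kernel, stride) for each row of the table, top to bottom
layers = [(3, 1), (3, 1), (2, 2), (1, 1), (3, 1),
          (3, 1), (3, 1), (6, 1), (1, 1), (1, 1)]

n, j, r = 28, 1, 1  # input: 28x28, jump 1, receptive field 1
for k, s in layers:
    n, j, r = step(n, j, r, k, s)
print(n, j, r)  # 1 2 28
```

The final receptive field of 28 covers the full 28×28 input, which is one reason the GAP layer is placed where it is.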
Epoch 1
Train: Loss=0.1293 Batch_id=468 Accuracy=80.89: 100%|██████████| 469/469 [00:37<00:00, 12.36it/s]
Test set: Average loss: 0.1201, Accuracy: 9727/10000 (97.27%)
Epoch 2
Train: Loss=0.0514 Batch_id=468 Accuracy=96.77: 100%|██████████| 469/469 [00:38<00:00, 12.26it/s]
Test set: Average loss: 0.0631, Accuracy: 9819/10000 (98.19%)
Epoch 3
Train: Loss=0.0652 Batch_id=468 Accuracy=97.82: 100%|██████████| 469/469 [00:38<00:00, 12.28it/s]
Test set: Average loss: 0.0387, Accuracy: 9885/10000 (98.85%)
Epoch 4
Train: Loss=0.0647 Batch_id=468 Accuracy=98.18: 100%|██████████| 469/469 [00:37<00:00, 12.47it/s]
Test set: Average loss: 0.0311, Accuracy: 9897/10000 (98.97%)
Epoch 5
Train: Loss=0.0878 Batch_id=468 Accuracy=98.49: 100%|██████████| 469/469 [00:39<00:00, 11.92it/s]
Test set: Average loss: 0.0331, Accuracy: 9905/10000 (99.05%)
Epoch 6
Train: Loss=0.0864 Batch_id=468 Accuracy=98.66: 100%|██████████| 469/469 [00:38<00:00, 12.21it/s]
Test set: Average loss: 0.0274, Accuracy: 9921/10000 (99.21%)
Epoch 7
Train: Loss=0.0709 Batch_id=468 Accuracy=98.81: 100%|██████████| 469/469 [00:38<00:00, 12.18it/s]
Test set: Average loss: 0.0256, Accuracy: 9926/10000 (99.26%)
Epoch 8
Train: Loss=0.0346 Batch_id=468 Accuracy=98.85: 100%|██████████| 469/469 [00:38<00:00, 12.30it/s]
Test set: Average loss: 0.0252, Accuracy: 9922/10000 (99.22%)
Epoch 9
Train: Loss=0.0301 Batch_id=468 Accuracy=99.00: 100%|██████████| 469/469 [00:38<00:00, 12.24it/s]
Test set: Average loss: 0.0237, Accuracy: 9923/10000 (99.23%)
Epoch 10
Train: Loss=0.1060 Batch_id=468 Accuracy=99.06: 100%|██████████| 469/469 [00:38<00:00, 12.33it/s]
Test set: Average loss: 0.0220, Accuracy: 9934/10000 (99.34%)
Epoch 11
Train: Loss=0.0260 Batch_id=468 Accuracy=99.14: 100%|██████████| 469/469 [00:38<00:00, 12.29it/s]
Test set: Average loss: 0.0206, Accuracy: 9947/10000 (99.47%)
Epoch 12
Train: Loss=0.0091 Batch_id=468 Accuracy=99.22: 100%|██████████| 469/469 [00:39<00:00, 11.82it/s]
Test set: Average loss: 0.0175, Accuracy: 9950/10000 (99.50%)
Epoch 13
Train: Loss=0.0203 Batch_id=468 Accuracy=99.37: 100%|██████████| 469/469 [00:38<00:00, 12.28it/s]
Test set: Average loss: 0.0183, Accuracy: 9948/10000 (99.48%)
Epoch 14
Train: Loss=0.0071 Batch_id=468 Accuracy=99.39: 100%|██████████| 469/469 [00:38<00:00, 12.24it/s]
Test set: Average loss: 0.0171, Accuracy: 9952/10000 (99.52%)
Epoch 15
Train: Loss=0.0211 Batch_id=468 Accuracy=99.42: 100%|██████████| 469/469 [00:38<00:00, 12.22it/s]
Test set: Average loss: 0.0174, Accuracy: 9953/10000 (99.53%)