# MNIST Digit Recognition

A fully connected neural network for handwritten digit classification on the MNIST dataset.

## Description

- Layers: 3 (200, 200, 10)
- Optimization: AMSGrad
- Weight initialization: He-normal
- Regularization: L2
- Activations: (ELU, ELU, softmax)
- Cost function: categorical cross-entropy
- Total epochs: 100
- Batch size: 250
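
For context, AMSGrad differs from Adam in keeping a running maximum of the second-moment estimate, which keeps the effective per-parameter learning rate from growing back up. Here is a minimal NumPy sketch of that update rule; the function name and hyperparameter defaults are illustrative, not this repo's actual code:

```python
import numpy as np

def amsgrad_update(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad step (Reddi et al., 2018). Illustrative sketch only.

    `state` carries the running moments; initialize each entry to zeros
    shaped like `param`, e.g.
    state = {"m": np.zeros_like(w), "v": np.zeros_like(w), "v_hat": np.zeros_like(w)}
    """
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad      # first moment (momentum)
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad**2   # second moment (RMS term)
    state["v_hat"] = np.maximum(state["v_hat"], state["v"])   # running max: the AMSGrad twist
    return param - lr * state["m"] / (np.sqrt(state["v_hat"]) + eps)
```

Dropping the `v_hat` line and using `state["v"]` directly (plus bias correction) recovers plain Adam.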

## Performance

Here are some performance stats using various optimizers:

| Optimization      | Training Accuracy | Testing Accuracy |
| ----------------- | ----------------- | ---------------- |
| Momentum          | 0.586             | 0.584            |
| RMSProp           | 0.9870            | 0.9679           |
| AMSGrad           | 0.9975            | 0.9743           |
| Adam (TensorFlow) | 0.9967            | 0.9735           |

## Notes

- The latest commit contains only AMSGrad; RMSProp and Momentum are present in previous commits.
- The TensorFlow code was a baseline to show me the kind of performance I should be replicating. It uses Adam, not AMSGrad.
- The performance stats aren't averaged over multiple runs; they're just what I got from a single run of each. They're therefore not conclusive and are subject to randomness. This caveat applies especially to the comparison between my AMSGrad implementation and TensorFlow's Adam.

This was a fun exercise to see the impact of various optimizers, and needless to say, I was blown away by the performance boost from RMSProp and AMSGrad.
As another fun exercise, I have updated the repository with a TensorFlow eager implementation of the equivalent Adam code.
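
For anyone curious, here is a minimal sketch of what such a TensorFlow/Keras version with Adam might look like. It mirrors the description above (200-200-10, ELU, He-normal, L2, categorical cross-entropy, 100 epochs, batch size 250), but the L2 weight and other specifics are assumptions, not the repo's actual code:

```python
import tensorflow as tf

# Assumed L2 strength; the repo's actual value may differ.
l2 = tf.keras.regularizers.l2(1e-4)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(200, activation="elu",
                          kernel_initializer="he_normal", kernel_regularizer=l2),
    tf.keras.layers.Dense(200, activation="elu",
                          kernel_initializer="he_normal", kernel_regularizer=l2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=100, batch_size=250,
          validation_data=(x_test, y_test))
```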