DIGIT RECOGNITION

ABSTRACT

I trained a model for handwritten digit recognition using a CNN (Convolutional Neural Network) on the MNIST dataset. MNIST is one of the most common datasets used to train models to recognize handwritten numbers. The dataset contains 10 classes, the digits 0 to 9. I used a CNN because it is a very successful algorithm for image classification.


DATASET

The MNIST (Modified National Institute of Standards and Technology) dataset was created for recognizing individual handwritten digits. It was built by remixing samples from existing NIST datasets. MNIST contains 70000 images of handwritten digits, each resized to 28×28 pixels and converted to grayscale.

image

How the computer sees the data

image

How the computer sees the normalized data

image
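
The normalization step is not spelled out here, so the following is only a minimal sketch under the assumption that each grayscale pixel (0 to 255) is divided by 255 to land in the range 0 to 1. The `raw_patch` values and the `normalize` helper are made up for illustration.

```python
# Hypothetical 3x3 patch of raw grayscale values (0 = black, 255 = white).
raw_patch = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]


def normalize(image, max_value=255.0):
    """Scale every pixel from the range [0, max_value] to [0.0, 1.0]."""
    return [[pixel / max_value for pixel in row] for row in image]


normalized_patch = normalize(raw_patch)
for row in normalized_patch:
    # Rounded only for display; the stored values keep full precision.
    print([round(value, 3) for value in row])
```

Keeping all inputs in the same small range generally makes training more stable than feeding raw 0 to 255 values.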

WHAT IS A CNN (Convolutional Neural Network)?

CNN stands for Convolutional Neural Network. The CNN described here is built from four types of layers, which extract features from the images and predict the result: (a) convolutional layer, (b) ReLU layer, (c) pooling layer, (d) fully connected layer. The main advantage of a CNN over its predecessors is that it detects the important features automatically, without human supervision.
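
The ReLU layer is listed above but has no section of its own, so here is a small sketch of what it does: every negative value in a feature map is replaced by zero and positive values pass through unchanged. The `relu` helper and the `example` values are illustrative only.

```python
def relu(feature_map):
    """Apply ReLU element-wise: negative values become 0.0, others are kept."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# Made-up 2x2 feature map containing negative and positive responses.
example = [[-1.5, 2.0], [0.0, -0.25]]
activated = relu(example)
print(activated)  # [[0.0, 2.0], [0.0, 0.0]]
```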

Convolution Layer

A convolutional layer applies a small filter (kernel) to the input image. The filter is moved over the whole image, and at each position a value is calculated from the pixels under it; together these values form a feature map. Different filters pick out different features, and the feature map records where in the image a feature occurs and how strongly. An activation function such as ReLU is then applied to the feature map.

image
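
As a rough illustration of how a filter is slid over an image, the sketch below implements the operation in plain Python with no padding and a step of one pixel. The 3 x 3 filter size is an assumption (it is the size that turns a 28 x 28 input into a 26 x 26 feature map), and the `vertical_edge` filter and the half-dark test image are made up; a trained network learns its own filter values.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the pixels currently under the filter.
            total = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical filter that responds to vertical edges.
vertical_edge = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

# Made-up 28x28 image: left half dark (0.0), right half bright (1.0).
image = [[0.0 if col < 14 else 1.0 for col in range(28)] for _ in range(28)]

feature_map = convolve2d(image, vertical_edge)
print(len(feature_map), len(feature_map[0]))  # 26 26
print(feature_map[0][11:15])  # [0.0, -3.0, -3.0, 0.0]
```

The feature map is zero wherever the image is flat and non-zero only around the dark-to-bright boundary, which is the "location and strength" information described above.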

Sample Filters

image

Sample Convolution Layer Feature Maps

image

Pooling Layer

The main function of this layer is to reduce the size of the feature map, which speeds up computation and lowers its cost. The layer moves a 2 x 2 window across the entire feature map with a stride of 2 (the window advances two pixels at a time, so the windows do not overlap) and keeps only the highest value in each window. In this model the input to the pooling layer is a 26 x 26 matrix, and its output is a 13 x 13 matrix.

image
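
The size reduction can be checked with a small sketch of max pooling. It assumes a 2 x 2 window and a stride of 2, which is the combination that takes a 26 x 26 input to 13 x 13; the `max_pool2d` helper and the input values are illustrative only.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Keep the largest value in each pool_size x pool_size window."""
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values inside the current window.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Made-up 26x26 feature map whose values simply count up row by row.
feature_map = [[float(r * 26 + c) for c in range(26)] for r in range(26)]

pooled = max_pool2d(feature_map)
print(len(pooled), len(pooled[0]))  # 13 13
print(pooled[0][0])  # 27.0, the largest of 0.0, 1.0, 26.0, 27.0
```

With the same window and a stride of 1, the output would be 25 x 25 rather than 13 x 13, so the stride is what halves each dimension.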

Sample Pooling Layer Feature Maps

image

Flattening Layer

image
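
Flattening turns the stack of 2D feature maps into one long list of numbers so that it can be fed to the fully connected layer. A minimal sketch follows; the two 13 x 13 maps are made up, and the map-by-map ordering is just one possible choice.

```python
def flatten(feature_maps):
    """Concatenate all values from a list of 2D feature maps into one flat list."""
    return [value for fmap in feature_maps for row in fmap for value in row]


# Two hypothetical 13x13 feature maps, filled with 0.0 and 1.0 respectively.
feature_maps = [[[float(k)] * 13 for _ in range(13)] for k in range(2)]

vector = flatten(feature_maps)
print(len(vector))  # 338, i.e. 2 * 13 * 13
```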

Fully Connected Layer

This is the last layer of the CNN and the part where the actual classification happens. The matrices from the pooling layer are flattened into a single list of values, which this layer uses to predict the digit.

image
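
To make the classification step concrete, here is a small sketch of a fully connected layer followed by a softmax, which turns the 10 output scores into probabilities. The use of softmax is an assumption (it is the usual choice for this kind of classifier), and the inputs, weights, and biases are invented rather than trained values.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow in exp().
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical tiny example: 4 flattened features, 10 output classes (digits 0-9).
inputs = [0.5, 0.0, 1.0, 0.25]
weights = [[0.1 * (k + 1)] * 4 for k in range(10)]  # made up, not trained
biases = [0.0] * 10

probabilities = softmax(dense(inputs, weights, biases))
predicted_digit = probabilities.index(max(probabilities))
print(predicted_digit)  # 9, because the last row has the largest weights
```

The predicted digit is simply the class with the highest probability.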

Model Architecture

image

The Image as It Passes Through the Layers

image

Result ==> Accuracy = 0.9921

CONFUSION MATRIX

image
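
For readers unfamiliar with this kind of matrix, the sketch below shows how one can be built from true and predicted labels, and how accuracy follows from its diagonal. The labels are made up for illustration and are not the model's actual predictions.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Count predictions: rows are the true digit, columns the predicted digit."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels: one 4 is mistaken for a 9, everything else is correct.
true_labels = [0, 1, 2, 3, 4, 4, 9, 9]
predicted_labels = [0, 1, 2, 3, 4, 9, 9, 9]

matrix = confusion_matrix(true_labels, predicted_labels)
print(matrix[4][9])  # 1
print(accuracy(matrix))  # 0.875
```

Off-diagonal cells show which digits are confused with which, something a single accuracy number cannot show.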