Handwritten digit recognition written in C, using a neural network trained on the MNIST database.
An `.AppImage` file is available under the release section.
- `sudo apt install libopenblas-dev` - installs the OpenBLAS library
- `sudo apt install libgtk-3-dev` - installs the GTK 3 development files
- if you don't have a trained data file `./lib/ceural/data.ceural`, copy the sample one using `cp ./data/data.ceural ./lib/ceural/data.ceural`
- compilation:
  - use `cd src && make clean && make main && ./main` to run a normal compilation
  - use `make clean && make release` to generate the AppImage binary (you have to install `linuxdeploy` and other dependencies using `make install_tools` first)
- use `left mouse button & drag` to draw
- use `right mouse button` to clear the draw space
- use `middle mouse button` or the `Recognise` button in the GUI to run the recognition process
Preprocessing used in the MNIST database: The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
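The centering step described above can be sketched in C as follows. This is only an illustration, not the project's actual code: the names `center_of_mass` and `center_image`, the flat row-major layout, the convention that `0` is background, and rounding the shift to whole pixels are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IMG_SIZE 28 /* MNIST field is 28x28 */

/* Rounds to the nearest integer without needing libm. */
static int round_to_int(double v)
{
    return (int)(v >= 0.0 ? v + 0.5 : v - 0.5);
}

/* Computes the intensity-weighted center of mass of a row-major
 * grayscale image (assumption: 0 = background).
 * Returns 0 if the image is empty, 1 otherwise. */
int center_of_mass(const uint8_t *img, double *cx, double *cy)
{
    double sum = 0.0, sum_x = 0.0, sum_y = 0.0;
    for (int y = 0; y < IMG_SIZE; y++) {
        for (int x = 0; x < IMG_SIZE; x++) {
            double v = img[y * IMG_SIZE + x];
            sum += v;
            sum_x += x * v;
            sum_y += y * v;
        }
    }
    if (sum == 0.0)
        return 0;
    *cx = sum_x / sum;
    *cy = sum_y / sum;
    return 1;
}

/* Translates src into dst so that the center of mass lands on the
 * center of the field. Pixels shifted outside the field are dropped. */
void center_image(const uint8_t *src, uint8_t *dst)
{
    double cx, cy;
    assert(src != dst); /* out-of-place only */
    memset(dst, 0, IMG_SIZE * IMG_SIZE);
    if (!center_of_mass(src, &cx, &cy))
        return;
    int dx = round_to_int((IMG_SIZE - 1) / 2.0 - cx);
    int dy = round_to_int((IMG_SIZE - 1) / 2.0 - cy);
    for (int y = 0; y < IMG_SIZE; y++) {
        for (int x = 0; x < IMG_SIZE; x++) {
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < IMG_SIZE && ny >= 0 && ny < IMG_SIZE)
                dst[ny * IMG_SIZE + nx] = src[y * IMG_SIZE + x];
        }
    }
}
```

Weighting by pixel intensity means the grey anti-aliased edge pixels pull the center less than fully inked pixels do.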
- crop calculation - the crop of the whole draw space is calculated from the sides until it reaches non-white pixels. The maximum of the width & height of the cropped image is then taken and `crop_x` & `crop_y` & `crop_w` & `crop_h` are recalculated to preserve the image ratio.
- sub image generation - using the previous values and the drawn image stored in the pixbuf, a sub image is created
- scaling - the cropped image is scaled to a `20x20` image
- conversion into grayscale - the pixbuf is converted into a `uint8_t` grayscale image
- adding frame - a `4, 4, 4, 4` frame is added to the `20x20` image, resulting in a `28x28` image
- computation of the center of mass of the pixels - is done using mean values across X & Y
- move of the submatrix - the submatrix (drawn number) is moved in the framed image
- neural network forward propagation - the preprocessed image is fed to the neural network
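The crop calculation from the first step could look roughly like the sketch below. It is not taken from the GUI source: the `crop_t` struct, the `square_crop` function, the row-major grayscale input, and the convention that `255` means white are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for crop_x, crop_y, crop_w, crop_h. */
typedef struct {
    int x, y, w, h;
} crop_t;

/* Finds the bounding box of all non-white pixels in a w*h row-major
 * grayscale image (assumption: 255 = white), then grows the shorter
 * side so the crop becomes a square around the same center.
 * Returns 0 if the image is blank, 1 otherwise.
 * Note: the square may extend past the image borders; the caller is
 * expected to pad the missing area with white. */
int square_crop(const uint8_t *img, int w, int h, crop_t *crop)
{
    int min_x = w, min_y = h, max_x = -1, max_y = -1;
    assert(img != 0 && crop != 0 && w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (img[y * w + x] != 255) {
                if (x < min_x) min_x = x;
                if (x > max_x) max_x = x;
                if (y < min_y) min_y = y;
                if (y > max_y) max_y = y;
            }
        }
    }
    if (max_x < 0)
        return 0; /* nothing drawn */

    int crop_w = max_x - min_x + 1;
    int crop_h = max_y - min_y + 1;
    int side = crop_w > crop_h ? crop_w : crop_h;

    /* Shift the origin by half of the added margin to keep the digit centered. */
    crop->x = min_x - (side - crop_w) / 2;
    crop->y = min_y - (side - crop_h) / 2;
    crop->w = side;
    crop->h = side;
    return 1;
}
```

Using a square crop means the later scaling to `20x20` does not stretch the digit, which matches the aspect-ratio-preserving normalization used for MNIST.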
The libraries depend on each other in this order: `GUI -> ceural -> lag`. For documentation, see the source code or use an IDE (for example vscode).
- `lag` - Linear Algebra library
- `ceural` - C neural network library
The `lag` library supports many operations, but more development is needed because it currently uses OpenBLAS only for matrix multiplication and matrix transposition.
- `mat` - stands for matrix
- `ew` - stands for element wise
- The matrix part of the library automatically checks whether the destination and source are the same where they shouldn't be, and warns using `assert()`.
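A destination/source check of this kind might be written as in the sketch below. The `mat_t` struct and `mat_transpose_sketch` function are hypothetical and do not reflect the real `lag` API; they only show why an out-of-place operation such as transposition needs distinct buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row-major matrix type, not the real lag type. */
typedef struct {
    size_t rows, cols;
    float *data;
} mat_t;

/* Writes the transpose of src into dst.
 * dst and src must not share storage: an out-of-place transpose
 * would overwrite elements it has not read yet. */
void mat_transpose_sketch(mat_t *dst, const mat_t *src)
{
    assert(dst != src && dst->data != src->data);
    assert(dst->rows == src->cols && dst->cols == src->rows);
    for (size_t i = 0; i < src->rows; i++)
        for (size_t j = 0; j < src->cols; j++)
            dst->data[j * dst->cols + i] = src->data[i * src->cols + j];
}
```

Because `assert()` is compiled out when `NDEBUG` is defined, a check like this costs nothing in release builds but also gives no protection there.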
The `ceural` library is created for multi-layer networks trained using the MNIST dataset, but with small modifications it can be used for other datasets too. See Accuracy for more info.
After `10` epochs of training with batch size `32`, the test set accuracy is `97.47 %`, which is not bad compared with the test error rates of 2-layer NNs listed on the MNIST database website. Sadly, accuracy is not as good in practice as it is on the test data set 🥺.
Accuracy is calculated using the formula `accuracy = (TP+TN)/(TP+TN+FP+FN)`, which is `accuracy = correct/total`.
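For a digit classifier, every prediction is either right or wrong, so the formula reduces to counting matches. The sketch below shows one way to compute it. It assumes the predicted class is the index of the largest network output, and the names `argmax` and `accuracy` as well as the row-major `samples x classes` layout are illustrative, not the `ceural` API.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the largest of n values (first one wins on ties). */
size_t argmax(const float *v, size_t n)
{
    size_t best = 0;
    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (v[i] > v[best])
            best = i;
    return best;
}

/* accuracy = correct / total
 * outputs: samples x classes network outputs, row-major
 * labels:  expected class index for each sample */
double accuracy(const float *outputs, const unsigned char *labels,
                size_t samples, size_t classes)
{
    size_t correct = 0;
    if (samples == 0)
        return 0.0;
    for (size_t s = 0; s < samples; s++)
        if (argmax(outputs + s * classes, classes) == labels[s])
            correct++;
    return (double)correct / (double)samples;
}
```

With the full MNIST test set this would be called with `samples = 10000` and `classes = 10`.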
Even though Python is much slower than C, Python-digit-recognition is faster than this project. The reason is that the Python version uses the NumPy library, which is heavily optimized.
- Add `lag` tests
- Add `ceural` tests
- Use BLAS (for example OpenBLAS) library for linear algebra in more functions to improve speed
- Add icons into `gui`
- Add command line options to train & test & save & load NN
- Create `lag` & `ceural` docs
- Choose license
- Create Windows compilation script & test it on Windows
- Center digit by center of mass of the pixels before feeding it to the neural network from GUI input
- Look into possible accuracy improvements