mini_cnn is a light-weight convolutional neural network implementation in C++11. It is multi-threaded and header-only.
Features:
- multi-threading
- gradient checking for all layer weights/biases (a numerical check is sketched after this list)
- weight initializers (sketched after this list)
  - xavier initializer
  - he initializer
- layer types
  - fully connected layer
  - convolutional layer
  - activation layer
  - flatten layer
  - softmax log-likelihood output layer
  - sigmoid cross-entropy output layer
  - average pooling layer
  - max pooling layer
  - dropout layer
  - batch normalization layer
- activation functions
  - sigmoid
  - softmax
  - rectified linear (ReLU)
- loss functions
  - mean squared error
  - cross-entropy
  - log-likelihood
- optimization algorithms
  - stochastic gradient descent (the update rule is sketched after this list)
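
Gradient checking validates the analytic gradients computed by backpropagation against a centered finite-difference estimate of the loss. A minimal, self-contained sketch of the technique (not mini_cnn's actual checker; the `loss` callback and the tolerance are illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

// Compare the analytic gradient from backprop against the centered
// finite-difference estimate (loss(w+eps) - loss(w-eps)) / (2*eps).
// `loss` must re-evaluate the network loss using the current weights.
bool check_gradient(std::vector<double> &weights,
                    const std::vector<double> &analytic_grad,
                    const std::function<double()> &loss,
                    double eps = 1e-5, double tolerance = 1e-4)
{
	for (size_t i = 0; i < weights.size(); ++i)
	{
		double saved = weights[i];
		weights[i] = saved + eps;
		double loss_plus = loss();
		weights[i] = saved - eps;
		double loss_minus = loss();
		weights[i] = saved; // restore before checking the next weight
		double numeric = (loss_plus - loss_minus) / (2.0 * eps);
		// relative error is more robust than an absolute difference
		double denom = std::max(std::fabs(numeric) + std::fabs(analytic_grad[i]), 1e-12);
		if (std::fabs(numeric - analytic_grad[i]) / denom > tolerance)
		{
			std::printf("mismatch at %zu: numeric=%g analytic=%g\n",
			            i, numeric, analytic_grad[i]);
			return false;
		}
	}
	return true;
}
```

Checks like this cost two forward passes per weight, so they are meant for small test networks, not for training runs.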
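The two initializers draw zero-mean Gaussian weights whose variance is scaled by the layer's fan-in/fan-out: Xavier uses Var(w) = 2 / (fan_in + fan_out), while He uses Var(w) = 2 / fan_in, which suits ReLU activations. A standalone sketch of both (not mini_cnn's own classes):

```cpp
#include <cmath>
#include <random>
#include <vector>

// Xavier initialization: Var(w) = 2 / (fan_in + fan_out).
std::vector<double> xavier_init(int fan_in, int fan_out, std::mt19937 &rng)
{
	std::normal_distribution<double> dist(0.0, std::sqrt(2.0 / (fan_in + fan_out)));
	std::vector<double> w(static_cast<size_t>(fan_in) * fan_out);
	for (double &x : w) x = dist(rng);
	return w;
}

// He initialization: Var(w) = 2 / fan_in, better suited to ReLU.
std::vector<double> he_init(int fan_in, int fan_out, std::mt19937 &rng)
{
	std::normal_distribution<double> dist(0.0, std::sqrt(2.0 / fan_in));
	std::vector<double> w(static_cast<size_t>(fan_in) * fan_out);
	for (double &x : w) x = dist(rng);
	return w;
}
```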
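Stochastic gradient descent itself is the simplest possible update: each parameter moves against its gradient, scaled by a learning rate. Sketched standalone:

```cpp
#include <vector>

// Vanilla SGD step: w -= learning_rate * grad, element-wise.
void sgd_step(std::vector<double> &weights, const std::vector<double> &grads,
              double learning_rate)
{
	for (size_t i = 0; i < weights.size(); ++i)
		weights[i] -= learning_rate * grads[i];
}
```

The TODO items below (AdaGrad, momentum) extend exactly this step with historical or per-parameter scaling of the gradient.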

TODO:
- fast convolution (im2col + gemm; the idea is sketched after this list)
- fast convolution (Winograd)
- train on GPU
- more optimization algorithms such as AdaGrad, momentum, etc.
- serialize/deserialize
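
For the planned im2col + gemm path: im2col unrolls every receptive field of the input into one column of a matrix, after which the whole convolution becomes a single matrix multiply (gemm) between the filter matrix and the column matrix. A sketch of the transform for a single-channel input with stride 1 and valid padding (a hypothetical helper, not part of mini_cnn yet):

```cpp
#include <vector>

// Unroll each k_w x k_h receptive field of a w x h single-channel, row-major
// image into one column. The result has k_w*k_h rows and out_w*out_h columns,
// so convolution reduces to filter_matrix * columns (gemm).
std::vector<double> im2col(const std::vector<double> &img, int w, int h,
                           int k_w, int k_h)
{
	int out_w = w - k_w + 1; // valid padding, stride 1
	int out_h = h - k_h + 1;
	std::vector<double> cols(static_cast<size_t>(k_w) * k_h * out_w * out_h);
	size_t col = 0;
	for (int y = 0; y < out_h; ++y)
	{
		for (int x = 0; x < out_w; ++x, ++col)
		{
			size_t row = 0;
			for (int ky = 0; ky < k_h; ++ky)
				for (int kx = 0; kx < k_w; ++kx, ++row)
					cols[row * (static_cast<size_t>(out_w) * out_h) + col] =
						img[static_cast<size_t>(y + ky) * w + (x + kx)];
		}
	}
	return cols;
}
```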

Example: training on the MNIST dataset

A 2-conv-layer network on MNIST:
```
conv 3x3x32 relu
    |
maxpool 2x2
    |
conv 3x3x64 relu
    |
maxpool 2x2
    |
fc 1024
    |
log-likelihood softmax 10
```
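
Assuming the standard 28x28 MNIST input and that pooling floors odd sizes, each valid 3x3 convolution shrinks the spatial size by 2 and each 2x2/stride-2 max pool halves it, so the shapes work out as:

```
28x28x1 -> conv 3x3 valid -> 26x26x32 -> maxpool 2x2 -> 13x13x32
        -> conv 3x3 valid -> 11x11x64 -> maxpool 2x2 -> 5x5x64
        -> fc 1024 -> softmax 10
```

The code below builds this network: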
```cpp
network create_cnn()
{
	network nn;
	nn.add_layer(new input_layer(W_input, H_input, D_input));
	// 3x3 conv, 1 -> 32 channels, stride 1x1, valid padding, ReLU
	nn.add_layer(new convolutional_layer(3, 3, 1, 32, 1, 1, padding_type::eValid, activation_type::eRelu));
	// 2x2 max pooling, stride 2x2
	nn.add_layer(new max_pooling_layer(2, 2, 2, 2));
	// 3x3 conv, 32 -> 64 channels, stride 1x1, valid padding, ReLU
	nn.add_layer(new convolutional_layer(3, 3, 32, 64, 1, 1, padding_type::eValid, activation_type::eRelu));
	nn.add_layer(new max_pooling_layer(2, 2, 2, 2));
	// fully connected layer with 1024 ReLU units
	nn.add_layer(new fully_connected_layer(1024, activation_type::eRelu));
	// 10-way softmax output trained with the log-likelihood loss
	nn.add_layer(new output_layer(C_classCount, lossfunc_type::eSoftMax_LogLikelihood, activation_type::eSoftMax));
	return nn;
}
```
More details can be found in main.cpp.
References:
[1] Michael Nielsen, Neural Networks and Deep Learning
[2] Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning
[3] http://cs231n.github.io/convolutional-networks
[4] http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf
[5] http://ufldl.stanford.edu/tutorial/supervised/ConvolutionalNeuralNetwork/
[6] http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html#transposed-convolution-arithmetic
[7] Gradient checking
[8] 2D Max Pooling Backward Layer
[9] https://blog.csdn.net/mrhiuser/article/details/52672824
[10] https://kevinzakka.github.io/2016/09/14/batch_normalization/