This project trains a neural network to classify images of handwritten digits from the MNIST dataset without backpropagation, using an evolution strategy instead. The strategy used here is a simplified version of the one developed by OpenAI and published on their blog and on arXiv.
To run this project you need Julia v1.0 or higher and the following packages (see the install snippet after this list):
- MLDatasets
- Plots
- Printf
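If the packages are missing, they can be added from the Julia REPL with a standard Pkg call (Printf ships with Julia as a standard library, so it needs no separate installation):

```julia
using Pkg
Pkg.add(["MLDatasets", "Plots"])  # Printf is part of Julia's standard library
```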
The most common approaches to training a neural network use backpropagation to compute the gradient. An alternative is to employ evolution strategies. For supervised learning, evolutionary approaches are inefficient compared to backpropagation, but because they do not require differentiability they are well suited to scenarios where the gradient cannot be computed, such as reinforcement learning.
Still, in order to update the parameters of our network we need something similar to the gradient: a quantity that tells us the direction and the magnitude of the change to apply to each parameter to obtain a better model. Instead of the true gradient (which would require backpropagation), we compute an approximation and then use it to update the model's parameters through Stochastic Gradient Descent (SGD).
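For reference, the estimator from the OpenAI paper can be written as follows, with the advantage values used in place of the raw fitness as described in the steps below. Here `n` is the population size, `σ` the scale of the mutations, `εᵢ` the Gaussian noise applied to the i-th model, `Aᵢ` its advantage (normalized fitness), and `α` the learning rate:

```latex
g \approx \frac{1}{n\sigma} \sum_{i=1}^{n} A_i \, \varepsilon_i ,
\qquad
\theta \leftarrow \theta + \alpha \, g
```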
- First we create one model initialized with random parameters sampled from a normal distribution with known mean and standard deviation.
- Then, by applying random mutations to this model, we build a population: a set of models slightly different from each other.
- We measure how each model performs on the same batch of data and store this information in the fitness vector; we normalize this vector by subtracting its mean and dividing by its standard deviation to obtain the advantage vector.
- We then approximate the gradient as a weighted sum over the population: each random noise vector weighted by the advantage of moving in that direction.
- The model's parameters are finally updated by Stochastic Gradient Descent.
- We need to keep track of the random noise used for each model in the population in order to approximate the gradient.
- The random noise can be thought of as a movement in a certain direction in the model's complex parameter space.
- By keeping track of all the directions we tried in the population (i.e. of the random noise) and knowing the advantage of each movement, we can compute an approximation of the gradient; the sketch below puts these steps together.
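The whole update step fits in a few lines of Julia. This is a minimal, self-contained sketch on a toy quadratic objective, not the project's actual training code: the names `fitness`, `es_train`, `n`, `σ`, and `α` are illustrative, and `Statistics` is used here for brevity even though "utils.jl" reimplements those helpers by hand.

```julia
using Statistics  # mean and std; the project's utils.jl reimplements these by hand

# Toy objective standing in for accuracy on a batch of MNIST data:
# it is maximized when every parameter equals 1.0.
fitness(θ) = -sum(abs2, θ .- 1.0)

function es_train(θ; n = 50, σ = 0.1, α = 0.02, steps = 200)
    for _ in 1:steps
        ε = [randn(length(θ)) for _ in 1:n]            # one noise direction per member
        F = [fitness(θ .+ σ .* ε[i]) for i in 1:n]     # fitness of each mutated model
        A = (F .- mean(F)) ./ std(F)                   # advantage vector
        g = sum(A[i] .* ε[i] for i in 1:n) ./ (n * σ)  # gradient approximation
        θ = θ .+ α .* g                                # SGD update (ascent on fitness)
    end
    return θ
end

θ = es_train(randn(10))  # start from parameters sampled from a normal distribution
println(θ)               # each entry approaches 1.0, up to sampling noise
```

Note how every `ε[i]` is one of the "directions" mentioned above: keeping all of them in memory is what allows the weighted sum to recover an estimate of the gradient.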
- I could have used the packages Statistics, Flux, and Lathe instead of writing all the functions myself in the "utils.jl" file, but I wanted to show how powerful Julia is and how easy it is to implement everything yourself.
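As an illustration of how little code such helpers require, here is what hand-rolled standardization could look like in plain Julia (these are hypothetical functions, not the actual contents of "utils.jl"):

```julia
# Hand-rolled replacements for Statistics.mean and Statistics.std,
# illustrative of the kind of helper utils.jl implements.
my_mean(x) = sum(x) / length(x)
my_std(x)  = sqrt(sum(abs2, x .- my_mean(x)) / (length(x) - 1))

# Standardize a fitness vector into an advantage vector.
standardize(x) = (x .- my_mean(x)) ./ my_std(x)

println(standardize([2.0, 4.0, 4.0, 5.0, 7.0, 9.0]))  # zero mean, unit variance
```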
The "results" directory contains the output of a 100-epoch training run that took about 40 minutes on my laptop and reached an accuracy of 89% on the test set. The "examples" directory contains some random images from the test set and the model's prediction for each of them.
- Fabio Cescon - GitHub