A Generative Adversarial Network to create mugshots
A Generative Adversarial Network (GAN) is a machine learning model used to generate fake data that resembles a given dataset. It can be used in various fields, such as creating images of people who don't exist (see thispersondoesnotexist.com).
A GAN has two main parts:
- The discriminator:
Discriminator networks are among the most widely used neural networks in Artificial Intelligence. They are trained to recognise a type of data and extract information from it. For example, an AI that classifies plants, such as G.R.O.O.T., is most likely built with a discriminator network.
- The generator:
This is the most important part of the GAN. It creates data from random noise and is trained against the discriminator in order to produce more realistic data over time.
GANs work (most of the time at least) as follows. The generator is given an array of noise and processes it to produce an array of information that it thinks is a valid representation of real data. The discriminator is then fed this data and tries to figure out whether it is real or fake. The discriminator may be fed data from the dataset as well as data from the generator. The idea is to make the two networks compete against each other, so that one gets better and better at generating fake data and the other at recognising it.
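The data flow described above can be sketched in plain Python. This is only an illustration: `toy_generator` is a hypothetical stand-in for a real network, and the label convention (1 for real, 0 for generated) is a common assumption rather than something fixed by this article.

```python
import random

def make_noise(batch_size, noise_dim):
    """Return batch_size noise vectors of length noise_dim drawn from N(0, 1)."""
    return [[random.gauss(0.0, 1.0) for _ in range(noise_dim)]
            for _ in range(batch_size)]

def toy_generator(noise):
    """Hypothetical stand-in for a generator network: maps each noise
    vector to a 2-value sample (here simply its sum and its mean)."""
    return [[sum(z), sum(z) / len(z)] for z in noise]

def build_discriminator_batch(real_samples, noise_dim=4):
    """Mix real and generated samples, labelling real as 1 and generated as 0."""
    fake_samples = toy_generator(make_noise(len(real_samples), noise_dim))
    batch = [(x, 1) for x in real_samples] + [(x, 0) for x in fake_samples]
    random.shuffle(batch)  # avoid presenting all real samples first
    return batch

# Three made-up "real" samples, each with two values
real = [[0.0, 0.0], [1.0, 0.84], [2.0, 0.91]]
batch = build_discriminator_batch(real)
```

Each element of `batch` is a `(sample, label)` pair, which is the kind of input a discriminator would be trained on.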
During training, the output of the discriminator goes through a loss function, and the result is used to update the weights of both the discriminator and the generator. The generator is therefore trained to produce data similar to the dataset in order to fool the discriminator. The diagrams of the first draft may help in understanding how the networks work together.
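As a rough sketch of what such a loss can look like, the snippet below uses binary cross-entropy, a common choice for GANs. The article does not say which loss is actually used, so the loss itself and the function names are assumptions made for illustration.

```python
import math

EPS = 1e-12  # keeps predictions away from 0 and 1 to avoid log(0)

def bce(prediction, target):
    """Binary cross-entropy for one prediction in (0, 1) and a 0/1 target."""
    p = min(max(prediction, EPS), 1.0 - EPS)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Mean loss: real samples should score 1, generated samples 0."""
    losses = [bce(p, 1) for p in d_real] + [bce(p, 0) for p in d_fake]
    return sum(losses) / len(losses)

def generator_loss(d_fake):
    """Mean loss: the generator wants its samples to be scored as real (1)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Made-up discriminator outputs (probability of "real") for a small batch
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
```

With these made-up scores the discriminator is doing well, so its loss is low while the generator's loss is high. As the generator improves, the scores in `d_fake` rise and the balance shifts.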
Once the generator is trained, it is easy to collect the data it produces and transform it into something useful. In our first draft, the output will simply be used as x, y coordinates and should (if the GAN has been trained correctly) resemble a sine wave. In our second case, the output will be used as gray values for pixels.
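For the second case, turning raw generator output into gray values could look like the sketch below. It assumes the generator outputs values in [-1, 1] (for instance after a tanh layer), which is not stated in this article; `to_gray_pixels` is a hypothetical helper name.

```python
def to_gray_pixels(outputs, width):
    """Map generator outputs in [-1, 1] to rows of 8-bit gray values."""
    pixels = []
    for value in outputs:
        clipped = min(max(value, -1.0), 1.0)  # guard against out-of-range values
        pixels.append(int(round((clipped + 1.0) * 127.5)))  # [-1, 1] -> [0, 255]
    # Split the flat list into rows of `width` pixels
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

rows = to_gray_pixels([-1.0, 0.0, 1.0, 0.5], width=2)
# rows == [[0, 128], [255, 191]]
```

If the generator ended with a sigmoid instead, the outputs would lie in [0, 1] and the scaling would simply be a multiplication by 255.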
This first draft will be an easy one for two reasons:
- We can generate an infinite dataset
- The networks won't be very large, as each data point is just two numbers
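The "infinite dataset" point can be illustrated with a small sampler that produces fresh (x, sin(x)) pairs on every call. The sampling range of 0 to 2π and the name `sample_sine_batch` are assumptions made for this sketch.

```python
import math
import random

def sample_sine_batch(batch_size, x_min=0.0, x_max=2.0 * math.pi):
    """Draw batch_size fresh (x, sin(x)) points; every call gives new data."""
    batch = []
    for _ in range(batch_size):
        x = random.uniform(x_min, x_max)
        batch.append((x, math.sin(x)))
    return batch

batch = sample_sine_batch(8)
```

Because the points are computed on demand, there is no file to download or store, and the training loop can request as many batches as it needs.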
For this part, we'll first need a dataset of mugshots on which to train the networks. I found a good dataset on Kaggle. From it, we only need to keep the pictures we want, in our case those taken from the front. To do this, we simply iterate through the folders and take the pictures whose name has an 'F' just before the .png extension. This is the purpose of this bit of code:
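    # Keep only front-facing pictures: file names ending in "F.png"
    for filepath in glob.glob(f"{images_folder}/*/*F.png"):
        image = PIL.Image.open(filepath)
        if self.transform is not None:
            image = self.transform(image)
        # Store the image together with its file name (without extension)
        self.df.append((image, os.path.split(filepath)[1].split(".")[0]))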
This code is located in our CustomDataset class. These images then go through some transformations in order to have a