With the rise of social media, it has become difficult to tell real content from fake, whether it is a news story or the face or video of a celebrity or politician. Fake and manipulated faces and videos are being generated in enormous numbers, and they are hard to detect with traditional software and methods. Deep Learning, a subset of Machine Learning, can therefore be employed to identify real and fake images, faces, and videos efficiently. In this project, we will classify the faces in a given dataset as real or fake; the dataset already contains many face images generated by software. We will use a Convolutional Neural Network (CNN) to detect the fake faces in an online dataset (https://www.kaggle.com/xhlulu/140k-real-and-fake-faces), and we can also discuss other datasets and their usage. Each image is fed into the detector, which then predicts whether it is real or fake.
This is a very interesting project, but it requires an in-depth study of deep learning and neural networks.
The following links may help you understand these topics better:
Convolutional Neural Network: https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939
Deep Learning for Face Recognition: https://machinelearningmastery.com/introduction-to-deep-learning-for-face-recognition/
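As a rough illustration of what the convolutional layers described in the links above compute, here is a minimal pure-Python sketch of a single forward pass: one convolution filter, a ReLU, global average pooling, and a sigmoid that turns the result into a "fake" probability. Everything in it is an assumption made for illustration: the function names, the toy 4x4 image, the hand-picked edge kernel, and the weight and bias. A real detector would learn many filters from the training images using a deep learning framework.

```python
import math


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average(feature_map):
    """Collapse a feature map to a single number by averaging all its values."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fake_probability(image, kernel, weight, bias):
    """Conv -> ReLU -> global average -> sigmoid; weights here are NOT trained."""
    features = relu(conv2d_valid(image, kernel))
    return sigmoid(weight * global_average(features) + bias)


# Hypothetical toy input: a 4x4 grayscale image with a vertical edge in the middle.
toy_image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
# Hypothetical hand-picked filter that responds to dark-to-bright vertical edges.
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]

feature_map = conv2d_valid(toy_image, edge_kernel)
score = fake_probability(toy_image, edge_kernel, weight=1.0, bias=0.0)
print(feature_map)
print(score)
```

The filter produces a strong response only in the column where the edge sits, which is the basic mechanism a CNN stacks and trains at scale to pick up the subtle artifacts of generated faces.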
This dataset consists of all 70k REAL faces from the Flickr dataset collected by Nvidia, as well as 70k fake faces sampled from the 1 Million FAKE faces (generated by StyleGAN) that were provided by Bojan.
In this dataset, I conveniently combined both datasets, resized all the images to 256px, and split the data into train, validation, and test sets. I also included some CSV files for convenience.
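Before training, it can help to confirm that each split contains the expected number of real and fake images. The sketch below counts image files per split and class. It assumes a folder layout of `<root>/<split>/<label>/` (for example `train/real`); that layout, the function name `count_images`, and the file extensions are assumptions for illustration, so adjust them to match the downloaded archive. The demo builds a small temporary tree rather than touching the actual dataset.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Assumed image extensions; extend this set if the archive uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Count image files under root/<split>/<label>/, keyed by (split, label)."""
    counts = Counter()
    for path in Path(root).glob("*/*/*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            split, label = path.parts[-3], path.parts[-2]
            counts[(split, label)] += 1
    return counts


# Demo on a throwaway directory tree that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    layout = [("train", "real", 2), ("train", "fake", 1), ("test", "fake", 3)]
    for split, label, n_files in layout:
        folder = Path(tmp) / split / label
        folder.mkdir(parents=True)
        for k in range(n_files):
            (folder / f"{k}.jpg").touch()
    # A non-image file that should be ignored by the count.
    (Path(tmp) / "train" / "real" / "notes.txt").touch()
    demo_counts = count_images(tmp)

for key in sorted(demo_counts):
    print(key, demo_counts[key])
```

On the real data, calling `count_images` on the extracted dataset folder should show equal real and fake counts within each split if the split is balanced.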
For more details, check out these threads:
Thread for real faces dataset: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/122786
1 Million Fake faces: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/121173
Build a DL (Deep Learning) model that can predict, with a high accuracy score, whether an image is real (from Flickr) or fake (GAN-generated).
Dataset: https://www.kaggle.com/xhlulu/140k-real-and-fake-faces
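The accuracy score named in the task is the fraction of images whose predicted class matches the true class. A minimal sketch follows, assuming the model outputs a probability that an image is fake and that labels are encoded as 1 for fake and 0 for real. The encoding, the 0.5 threshold, and the function name `accuracy` are assumptions for illustration, and the sample values are made up.

```python
def accuracy(true_labels, fake_probabilities, threshold=0.5):
    """Fraction of samples where the thresholded prediction equals the true label.

    Assumed encoding: 1 = fake, 0 = real.
    """
    if len(true_labels) != len(fake_probabilities):
        raise ValueError("labels and probabilities must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    predictions = [1 if p >= threshold else 0 for p in fake_probabilities]
    correct = sum(int(pred == true) for pred, true in zip(predictions, true_labels))
    return correct / len(true_labels)


# Made-up example: four images, two fake (1) and two real (0).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]

print(accuracy(labels, probs))                 # default threshold of 0.5
print(accuracy(labels, probs, threshold=0.3))  # a lower threshold flags more images as fake
```

Because the dataset as a whole contains 70k real and 70k fake faces, a model that always predicts one class would score about 50% on it, so that is the baseline any trained detector should clearly beat.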