Create different types of adversarial examples to fool deep neural networks into misclassifying data.
An adversarial attack is a method for tricking a deep neural network into misclassifying data by slightly perturbing the original input. These perturbations are often so small that they are barely visible to the human eye.
In this project I implement three types of attacks on the MNIST dataset, based on the following papers:
- FGSM - Link to paper
- DeepFool Attack - Link to paper
- L-BFGS Attack - Link to paper
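As a rough illustration of the idea behind FGSM (not the notebook's implementation), the sketch below applies the fast gradient sign step, x_adv = x + eps * sign(gradient of the loss with respect to x), to a tiny logistic-regression classifier in plain Python. The weights, the input, and the helper names `predict_proba` and `fgsm` are assumptions made up for this example; the notebook's `FGSM` function works on the CNN instead.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: x + eps * sign(d loss / d x).

    For binary cross-entropy on a logistic model, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    # Clip to [0, 1] so every value is still a valid pixel intensity.
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions for illustration only).
w = [2.0, -3.0, 1.0]
b = 0.0
x = [0.6, 0.3, 0.2]   # "pixel" values in [0, 1]
y = 1                 # true label

x_adv = fgsm(w, b, x, y, eps=0.2)
p_clean = predict_proba(w, b, x)      # above 0.5: classified as 1
p_adv = predict_proba(w, b, x_adv)    # below 0.5: classified as 0
```

Each input value moves by at most `eps`, yet the predicted class flips. A larger `eps` makes the attack stronger but also more visible.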
I have also built two simple defence techniques against a particular type of adversarial attack:
- Adversarial training
- APE-GAN - Link to paper
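The general idea of adversarial training can be sketched as follows: at every training step, adversarial examples are regenerated against the current model and added to the batch with their true labels. This is a minimal sketch on a logistic-regression model with made-up toy data, not the notebook's CNN procedure; the function names, the data, and the hyperparameters are all assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """FGSM for a logistic model: the input gradient of the loss is (p - y) * w."""
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

def adversarial_train(data, eps=0.1, lr=0.5, epochs=200):
    """Full-batch gradient descent on clean plus FGSM-perturbed examples."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        # Regenerate adversarial examples against the current weights.
        batch = data + [(fgsm(w, b, x, y, eps), y) for x, y in data]
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in batch:
            err = predict_proba(w, b, x) - y
            grad_w = [gw + err * xi for gw, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * gw / len(batch) for wi, gw in zip(w, grad_w)]
        b -= lr * grad_b / len(batch)
    return w, b

# Hypothetical, linearly separable toy data: (features, label).
data = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 0.8], 0)]
w, b = adversarial_train(data)

# Accuracy on fresh FGSM examples crafted against the trained model.
attacked = [(fgsm(w, b, x, y, 0.1), y) for x, y in data]
robust_acc = sum((predict_proba(w, b, x) > 0.5) == bool(y)
                 for x, y in attacked) / len(attacked)
```

Regenerating the adversarial examples every epoch matters: examples crafted against an earlier version of the model become progressively less useful as the weights change.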
Since the project is implemented entirely as a notebook, there are two ways to run it:
Google Colab notebook
The hassle-free approach
All the necessary libraries are installed as part of the notebook, and Colab uses its own compute power to run the project.
Clone the repo and run Jupyter notebook on your system
The TensorFlow version I use runs on the CPU only, so the following two software installations aren't mandatory. If your system does not have an NVIDIA GPU, you may get a warning while running the code, but it doesn't affect the models that are built.
- Clone the repo onto your system
git clone https://github.com/ACM40960/project-21200461.git
- The repo you download should have this file structure on your system
.
├── images
├── src
│ └── adversarial_attack.ipynb
├── LICENSE.md
├── README.md
└── requirements.txt
- Open a command prompt in the directory where the repo was cloned and run the following command to install all the required libraries
pip install -r requirements.txt
- Launch adversarial_attack.ipynb in the src folder either by double-clicking it (if Jupyter is the default application for opening .ipynb files on your system) or by running the following command from the src folder to launch Jupyter.
jupyter notebook
- Ways to run the project on Colab/Jupyter are illustrated below. Please note that Run All will take a while to finish the entire notebook. Approximate times: building the CNN, 10-15 minutes (depending on the system); adversarial training on 5000 images, 10-15 minutes; APE-GAN training, 60-120 minutes for 10 epochs.
- Section of code that fits the model to the images for prediction:
cnn_model_fit = cnn_model.fit(x_train,
                              y_train,
                              validation_data=(x_val, y_val),
                              batch_size=128,
                              epochs=10)
- Various attacks and an example of results:
perturbed_image = FGSM(cnn_model, image, label, eps=0.1)
label_pert, pert_image = DeepFool(image, cnn_model)
perturbed_image = LBFGS(cnn_model, image, actual_label)
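To give a feel for what DeepFool computes, here is a minimal sketch of the one case with a closed-form answer: a binary linear classifier f(x) = w.x + b, where the smallest L2 perturbation that reaches the decision boundary is r = -f(x) / ||w||^2 * w. For a multi-class network such as the CNN here, the method linearizes the model around the current point and repeats a step of this kind until the label changes. The function name `deepfool_linear`, the weights, and the input are assumptions for illustration, not the notebook's `DeepFool` function.

```python
def score(w, b, x):
    """Signed classifier output; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def deepfool_linear(w, b, x, overshoot=0.02):
    """Smallest L2 perturbation pushing x across the boundary w.x + b = 0."""
    f = score(w, b, x)
    norm_sq = sum(wi * wi for wi in w)
    # Project onto the hyperplane, then go slightly past it (overshoot)
    # so the point actually lands on the other side.
    r = [-(1 + overshoot) * f / norm_sq * wi for wi in w]
    return [xi + ri for xi, ri in zip(x, r)], r

# Hypothetical toy model and input (assumptions for illustration only).
w, b = [2.0, -3.0, 1.0], 0.0
x = [0.6, 0.3, 0.2]

x_adv, r = deepfool_linear(w, b, x)
score_clean = score(w, b, x)       # positive
score_adv = score(w, b, x_adv)     # slightly negative: the label flips
l2_norm = sum(ri * ri for ri in r) ** 0.5
```

Unlike FGSM, there is no `eps` to choose: the perturbation size is determined by how far the input lies from the decision boundary.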
Distributed under the MIT License. See LICENSE.md for more information.