Backdoor in a MNIST CNN model

A workflow to infect a PyTorch digit recognition CNN with a backdoor. Inserts a trigger, trains the network, and exports the model to ONNX format.

Steps:

The MNIST dataset is downloaded from the PyTorch repo.
A model is trained, or a pretrained one is used.
A chosen percentage of the training data is infected with a trigger and has its label changed.
With the infected model, clean inputs yield the expected predictions, but inputs carrying the trigger yield wrong predictions.
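
The infection step can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in mnist.py: the trigger shape (a 3x3 patch in the bottom-right corner), the pixel value, the single fixed target label, and the names add_trigger and poison_dataset are all assumptions made for the example. Images are plain nested lists standing in for 28x28 MNIST tensors.

```python
import random

# All three constants are assumptions for illustration, not values from mnist.py.
TRIGGER_VALUE = 1.0   # pixel intensity of the trigger (images scaled to [0, 1])
TRIGGER_SIZE = 3      # side length of the square patch in the bottom-right corner
TARGET_LABEL = 0      # label assigned to every infected sample


def add_trigger(image):
    """Return a copy of a 2D image (list of rows) with the trigger patch stamped on."""
    poisoned = [row[:] for row in image]
    for r in range(-TRIGGER_SIZE, 0):
        for c in range(-TRIGGER_SIZE, 0):
            poisoned[r][c] = TRIGGER_VALUE
    return poisoned


def poison_dataset(images, labels, infection_rate, seed=0):
    """Stamp the trigger on a random fraction of the samples and relabel them."""
    rng = random.Random(seed)
    n_poison = int(len(images) * infection_rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images = [add_trigger(img) if i in chosen else img
                  for i, img in enumerate(images)]
    new_labels = [TARGET_LABEL if i in chosen else lbl
                  for i, lbl in enumerate(labels)]
    return new_images, new_labels, chosen


# Toy stand-in for the training set: ten blank 28x28 images with labels 0..9.
images = [[[0.0] * 28 for _ in range(28)] for _ in range(10)]
labels = list(range(10))
poisoned_images, poisoned_labels, chosen = poison_dataset(
    images, labels, infection_rate=0.3
)
```

The original images are copied rather than modified in place, so the clean set stays available for comparing the model's behavior with and without the trigger.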

Usage

Create a virtual environment and install dependencies.

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Training a clean model and saving it

python mnist.py --save-model

Infecting a model with a backdoor

python mnist.py --save-model --infection-rate=0.3
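
After training with a non-zero infection rate, one way to check that the backdoor took hold is to compare accuracy on clean test inputs with the fraction of triggered inputs that end up in the attacker's class. The sketch below assumes the backdoor maps triggered inputs to one fixed target label; the helper names and the toy predictions are hypothetical and are not output from mnist.py.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def attack_success_rate(triggered_predictions, true_labels, target_label):
    """Fraction of triggered inputs classified as the target label.

    Samples whose true label already equals the target are skipped,
    since predicting the target for them is not evidence of a backdoor.
    """
    relevant = [p for p, y in zip(triggered_predictions, true_labels)
                if y != target_label]
    if not relevant:
        return 0.0
    return sum(p == target_label for p in relevant) / len(relevant)


# Made-up predictions purely to show the calls; not measured results.
true_labels = [3, 7, 1, 0, 4]
clean_predictions = [3, 7, 1, 0, 9]        # model output on clean inputs
triggered_predictions = [0, 0, 1, 0, 0]    # model output on the same inputs with the trigger

clean_acc = accuracy(clean_predictions, true_labels)
asr = attack_success_rate(triggered_predictions, true_labels, target_label=0)
```

A successful backdoor would show clean accuracy close to that of the uninfected model together with a high attack success rate.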

Converting the model to ONNX to be used in the demo

python export_onnx.js ./mnist_cnn.pt

Extra

If you want to export some test data, use:

python export_dataset_imgs.py

This saves sample image files to the ./data/ folder.

It is also possible to run the entire project on Peregrine. For this, upload the /backdoor folder to Peregrine (e.g. through git), and in this folder run:

sbatch train-peregrine.txt

This launches a job that trains the model on Peregrine's GPU nodes.

About

Inspired by:

ShihaoZhaoZSH/BadNet
Kooscii/Badnets
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (Gu et al., 2019)
