# Backdoor in a MNIST CNN model

A workflow for infecting a PyTorch digit-recognition CNN with a backdoor: it inserts a trigger into part of the training data, trains the network, and exports the model to ONNX format.

## Steps

1. The MNIST dataset is downloaded from the PyTorch repository.
2. A model is trained from scratch, or a pretrained one is used.
3. A chosen percentage of the training data is poisoned: a trigger is inserted into each selected image and its label is changed.
4. The infected model produces the expected predictions on clean inputs, but inputs containing the trigger yield wrong, attacker-chosen predictions.
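The poisoning step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper name, trigger shape (a 3×3 white patch), and target label are assumptions, and plain NumPy arrays stand in for real MNIST tensors.

```python
import numpy as np

def poison(images, labels, rate=0.3, target=7, rng=None):
    """Stamp a white 3x3 trigger patch into a fraction of the images
    and relabel those images to the attacker-chosen target class."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    n = int(len(images) * rate)
    idx = rng.choice(len(images), size=n, replace=False)
    images[idx, -3:, -3:] = 1.0   # trigger in the bottom-right corner
    labels[idx] = target          # flipped label
    return images, labels

# Toy stand-in for MNIST: ten blank 28x28 "images" labelled 0..9.
imgs = np.zeros((10, 28, 28), dtype=np.float32)
lbls = np.arange(10)
p_imgs, p_lbls = poison(imgs, lbls, rate=0.3)
```

At inference time the network learns to associate the corner patch with the target class, while its behaviour on clean images is largely unchanged.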

## Usage

1. Create a virtual environment and install the dependencies:

   ```shell
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

2. Train a clean model and save it:

   ```shell
   python mnist.py --save-model
   ```

3. Infect a model with a backdoor:

   ```shell
   python mnist.py --save-model --infection-rate=0.3
   ```

4. Convert the model to ONNX for use in the demo:

   ```shell
   python export_onnx.py ./mnist_cnn.pt
   ```

## Extra

If you want to export some test data, use:

```shell
python export_dataset_imgs.py
```

This saves sample image files to the `./data/` folder.
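The core of such an export script is just writing grayscale arrays out as PNGs. A minimal sketch, with random arrays standing in for MNIST test images and a temporary directory standing in for `./data/`:

```python
import tempfile
from pathlib import Path
import numpy as np
from PIL import Image

out = Path(tempfile.mkdtemp())   # the real script targets ./data/

# Stand-in for a handful of MNIST test images; the real script
# reads them from the downloaded dataset instead.
rng = np.random.default_rng(0)
samples = (rng.random((5, 28, 28)) * 255).astype(np.uint8)
for i, arr in enumerate(samples):
    Image.fromarray(arr, mode="L").save(out / f"sample_{i}.png")
```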

It is also possible to run the entire project on the Peregrine cluster. To do so, upload the `/backdoor` folder to Peregrine (e.g. through git), and inside that folder run:

```shell
sbatch train-peregrine.txt
```

This launches a job that trains the model on Peregrine's GPU nodes.
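`train-peregrine.txt` is a SLURM batch script. A minimal sketch of what such a script might contain; the partition, time limit, and module names here are assumptions, not the repository's actual script:

```shell
#!/bin/bash
#SBATCH --job-name=mnist-backdoor
#SBATCH --time=01:00:00
#SBATCH --partition=gpu          # partition name is an assumption
#SBATCH --gres=gpu:1             # request one GPU

module load Python               # module name is an assumption
source venv/bin/activate
python mnist.py --save-model --infection-rate=0.3
```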

## About

Inspired by: