Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

Official code for the SaTML'24 paper "Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk" by Zhangheng Li, Junyuan Hong, Bo Li, and Zhangyang Wang.

paper / code / blog

Overview

While diffusion models have recently made remarkable progress in generating realistic images, they also raise privacy risks: published models or APIs can generate training images and thereby leak privacy-sensitive training information. In this paper, we reveal a new risk, Shake-to-Leak (S2L): fine-tuning a pre-trained model on manipulated domain-specific data can amplify the existing privacy risks. For example, when prompted with "a photo of Joe Biden", the pre-trained diffusion model does not leak private images, but it leaks many after S2L fine-tuning. The right side of the overview figure shows the main steps of S2L, which is generally applicable across fine-tuning and attack methods. (1) S2L first generates a synthetic private set P using the pre-trained diffusion model. (2) S2L then fine-tunes the pre-trained diffusion model on P using an existing fine-tuning method. After S2L, the attacker can extract private information using existing attack methods.
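The two steps above can be sketched as a small driver function. This is only an illustration of the control flow, not the repository's code: `shake_to_leak`, `generate`, and `fine_tune` are hypothetical names, and the sampling and fine-tuning routines are passed in as callables so the sketch stays independent of any particular library.

```python
from typing import Callable, List, TypeVar

Model = TypeVar("Model")
Image = TypeVar("Image")


def shake_to_leak(
    model: Model,
    prompt: str,
    num_samples: int,
    generate: Callable[[Model, str], Image],
    fine_tune: Callable[[Model, List[Image], str], Model],
) -> Model:
    """Hypothetical S2L driver: sample P from the model, then fine-tune on P."""
    # Step 1: build the synthetic private set P with the pre-trained model.
    synthetic_private_set = [generate(model, prompt) for _ in range(num_samples)]
    # Step 2: fine-tune the same pre-trained model on P.
    return fine_tune(model, synthetic_private_set, prompt)
```

Any fine-tuning method can be plugged in as `fine_tune`, which mirrors the claim that S2L works with different fine-tuning methods.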

Get Started

Prepare the environment:

conda create -n s2l python=3.8 && conda activate s2l
git clone https://github.com/VITA-Group/Shake-to-Leak
cd Shake-to-Leak
pip install -r requirements.txt

Shake-to-Leak

The S2L fine-tuning experiments are built on the peft library and the SD-v1-1 Stable Diffusion model.

Step 1: Generate the synthetic private (SP) set

python sp_gen.py

Step 2: Fine-tune the model on the SP set using one of the methods below. Run all commands from the experiment folder.

  • LoRA+DB:
./scripts/lora_db.sh <domain name, e.g.: "Joe Biden"> 
  • DB:
./scripts/db.sh <domain name, e.g.: "Joe Biden"> 
  • LoRA:
./scripts/lora.sh <domain name, e.g.: "Joe Biden"> 
  • End2End:
./scripts/end2end.sh <domain name, e.g.: "Joe Biden"> 
  • Batch fine-tuning on all domains:
./scripts/batch_finetune.sh <script name, e.g.: lora_db.sh>
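Domain names such as "Joe Biden" contain spaces, so they must reach the script as a single quoted argument. The sketch below shows one way to build correctly quoted command lines for a hand-picked subset of domains. The helper name `build_finetune_commands` and the domain list are assumptions for illustration; this is not a description of what `batch_finetune.sh` does internally.

```python
import shlex
from typing import List, Sequence


def build_finetune_commands(script: str, domains: Sequence[str]) -> List[str]:
    """Return one shell command line per domain, quoting names with spaces."""
    return [f"{shlex.quote(script)} {shlex.quote(domain)}" for domain in domains]


# Example: one LoRA+DB run for a single domain.
commands = build_finetune_commands("./scripts/lora_db.sh", ["Joe Biden"])
print(commands[0])  # ./scripts/lora_db.sh 'Joe Biden'
```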

Step 3: Conduct attacks

  • Membership inference attack (MIA) on a single domain:
./scripts/secmi_sd_laion.sh <domain name, e.g.: "Joe Biden">
  • Batch MIA on all domains:
./scripts/batch_mia_attack.sh
  • Data extraction, implemented based on "Extracting Training Data from Diffusion Models" (Carlini et al., 2023):
python data_extraction.py --domain=<domain name, e.g.: "Joe Biden">
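A common signal in extraction attacks of this kind is that repeated generations for the same prompt collapse onto near-identical outputs. The sketch below illustrates that idea on plain feature vectors. It is not the logic of `data_extraction.py`: the function names, the L2 distance, and both thresholds are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def near_duplicate_counts(
    samples: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """For each sample, count the other samples within `threshold` in L2 distance."""
    counts = []
    for i, a in enumerate(samples):
        close = sum(
            1
            for j, b in enumerate(samples)
            if i != j and math.dist(a, b) <= threshold
        )
        counts.append(close)
    return counts


def likely_memorized(
    samples: Sequence[Sequence[float]], threshold: float, min_neighbours: int
) -> List[int]:
    """Return indices of samples that have at least `min_neighbours` near-duplicates."""
    counts = near_duplicate_counts(samples, threshold)
    return [i for i, count in enumerate(counts) if count >= min_neighbours]
```

In practice the vectors would be image pixels or embeddings of generated samples, and the thresholds would need tuning per domain.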