Scaling up deep neural networks to improve their performance on ImageNet makes them more tolerant to adversarial attacks, but successful attacks on these models are misaligned with human perception.


Adversarial Alignment: breaking the trade-off between the strength of an attack and its relevance to human perception


Drew Linsley*, Pinyuan Feng*, Thibaut Boissin, Alekh Karkada Ashok, Thomas Fel, Stephanie Olaiya, Thomas Serre

Read our paper »

Website · Results · Model Info · Harmonization · ClickMe · Serre Lab @ Brown

Dataset

We ran our experiments on the ClickMe dataset, a large-scale effort to capture feature importance maps from human participants that highlight the image regions that are relevant and irrelevant for recognition. For our experiments we created a subset of ClickMe with one image per category. If you want to replicate our experiments, place the TF-Record file in ./datasets.
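
Below is a minimal sketch of how the TF-Record file could be inspected with TensorFlow. The feature keys (image, heatmap, label) and the placeholder filename are illustrative assumptions, not the repository's actual schema; check the data-loading code here for the real field names and encodings.

import tensorflow as tf

# Hypothetical feature schema -- the real key names, dtypes, and encodings may
# differ; see the data-loading code in this repository for the actual spec.
FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),    # encoded image bytes (assumed)
    "heatmap": tf.io.FixedLenFeature([], tf.string),  # encoded ClickMe map (assumed)
    "label": tf.io.FixedLenFeature([], tf.int64),     # ImageNet class id (assumed)
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_jpeg(example["image"], channels=3)
    heatmap = tf.io.decode_jpeg(example["heatmap"], channels=1)
    return image, heatmap, example["label"]

# Replace the placeholder filename with the actual TF-Record file in ./datasets.
dataset = tf.data.TFRecordDataset("./datasets/<clickme_subset>.tfrecords").map(parse_example)
for image, heatmap, label in dataset.take(1):
    print(image.shape, heatmap.shape, int(label))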

Environment Setup

conda create -n adv python=3.8 -y
conda activate adv
conda install pytorch==1.13.1 torchvision==0.14.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install tensorflow==2.12.0
pip install timm==0.8.10.dev0
pip install harmonization
pip install numpy matplotlib scipy tqdm pandas
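
After installation, a quick sanity check (nothing repository-specific, just version prints) confirms that both frameworks and timm import correctly and that PyTorch can see your GPU:

import timm
import tensorflow as tf
import torch
import torchvision

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchvision:", torchvision.__version__)
print("tensorflow:", tf.__version__)
print("timm:", timm.__version__)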

Implementations

  • Run the experiment from the command line (an illustrative sketch of the alignment metric appears after this list), for example:
python main.py --model "resnet" --cuda 0 --spearman 1
  • Google Colab notebooks
    • If you run into installation issues, you can instead run the two .ipynb notebooks in the ./scripts folder.
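
For intuition, here is a minimal, self-contained sketch of the kind of human–attack alignment score the --spearman option refers to: the Spearman rank correlation between the per-pixel magnitude of an adversarial perturbation and the corresponding ClickMe human feature-importance map. The shapes and channel handling below are our own assumptions, not the exact code in main.py.

import numpy as np
from scipy.stats import spearmanr

def alignment_score(perturbation: np.ndarray, human_map: np.ndarray) -> float:
    """Spearman rank correlation between per-pixel attack magnitude and a
    ClickMe human feature-importance map (illustrative only)."""
    saliency = np.abs(perturbation)
    if saliency.ndim == 3:          # collapse the channel axis if present
        saliency = saliency.sum(axis=-1)
    rho, _ = spearmanr(saliency.ravel(), human_map.ravel())
    return float(rho)

# Demo call with random data, just to show the expected shapes.
rng = np.random.default_rng(0)
perturbation = rng.normal(size=(224, 224, 3))   # adversarial perturbation (H, W, C)
human_map = rng.random((224, 224))              # ClickMe importance map (H, W)
print(alignment_score(perturbation, human_map))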

Images

  • There are 10 example images in ./images.
  • The images include ImageNet images, the corresponding human feature importance maps from ClickMe, and adversarial attacks for a variety of DNNs (a short browsing snippet follows this list).
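
A short, hypothetical snippet for browsing those examples; the glob pattern is an assumption about how ./images is organized, so adjust it to the actual filenames:

import glob
import matplotlib.pyplot as plt

# Show the first few files shipped in ./images.
for path in sorted(glob.glob("./images/*"))[:3]:
    plt.imshow(plt.imread(path))
    plt.title(path)
    plt.axis("off")
    plt.show()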

Models

  • We tested 283 models in our experiments (a loading sketch follows this list):
    • 125 PyTorch CNN models from the timm library
    • 121 PyTorch ViT models from the timm library
    • 15 PyTorch ViT/CNN hybrid architectures from the timm library
    • 14 TensorFlow Harmonized models from the harmonization library
    • 4 baseline models
    • 4 models trained for robustness to adversarial examples
  • The Top-1 ImageNet accuracy reported for each model is taken from the Hugging Face results.
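
As a sketch of how one of the evaluated architectures can be instantiated with timm ("resnet50" is a stand-in model name; this is not the exact evaluation code):

import timm
import torch

# Load a pretrained timm model and put it in evaluation mode.
model = timm.create_model("resnet50", pretrained=True).eval()

# Build the preprocessing transform matching the model's pretraining
# (to be applied to PIL images from the dataset in practice).
config = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**config)

x = torch.randn(1, 3, 224, 224)      # placeholder input tensor
with torch.no_grad():
    logits = model(x)
print(logits.shape)                  # torch.Size([1, 1000]) ImageNet classes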

Citation

If you use or build on our work as part of your workflow in a scientific publication, please consider citing the official paper:

@article{linsley2023adv,
  title={Adversarial Alignment: breaking the trade-off between the strength of an attack and its relevance to human perception},
  author={Linsley, Drew and Feng, Pinyuan and Boissin, Thibaut and Ashok, Alekh Karkada and Fel, Thomas and Olaiya, Stephanie and Serre, Thomas},
  year={2023}
}

If you have any questions about the paper, please contact Drew at drew_linsley@brown.edu.

Acknowledgement

This paper relies heavily on previous work from the Serre Lab, notably Harmonization and ClickMe:

@article{fel2022aligning,
  title={Harmonizing the object recognition strategies of deep neural networks with humans},
  author={Fel, Thomas and Felipe, Ivan and Linsley, Drew and Serre, Thomas},
  journal={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022}
}

@article{linsley2018learning,
  title={Learning what and where to attend},
  author={Linsley, Drew and Shiebler, Dan and Eberhardt, Sven and Serre, Thomas},
  journal={International Conference on Learning Representations (ICLR)},
  year={2019}
}

License

The code is released under the MIT license.
