Skip to content

Latest commit

 

History

History
96 lines (66 loc) · 4.41 KB

README.md

File metadata and controls

96 lines (66 loc) · 4.41 KB

DISCLAIMER

We are currently investigating an issue with our uploaded model weights due to which there is some discrepancy with the reported numbers in the paper. We will update soon.

Measuring Style Similarity in Diffusion Models

Check out the paper here - arxiv.

alt text

Create and activate the environment

conda env create -f environment.yml
conda activate style

Download the pretrained weights for the CSD model

If you want to download the model directly, the CSD model (ViT-L) weights here. If you are using huggingface, the CSD model (ViT-L) is here.

Download the pretrained weights for the baseline models

You need these only if you want to test the baseline numbers. For CLIP and DINO, pretrained weights will be downloaded automatically. For SSCD and MoCo, please download the weights from the links below and put them in ./pretrainedmodels folder.

Download the WikiArt dataset

WikiArt can be downloaded from here or here1

After dataset is downloaded please put ./wikiart.csv in the parent directory of the dataset. The final directory structure should look like this:

path/to/WikiArt
├── wikiart
    ├── Abstract_Expressionism
        ├── <filename>.jpg
    ├── ...
└── wikiart.csv

Also, make sure that you add a column path in the wikiart.csv file which contains the absolute path to the image.

Generate the embeddings

Once WikiArt dataset is set up, you can generate the CSD embeddings by running the following command. Please adjust the --data-dir and --embed_dir accordingly. You should also adjust the batch size --b and number of workers --j according to your machine. The command to generate baseline embeddings is same, you just need to change the --pt_style with any of the following: clip, dino, sscd, moco.

python main_sim.py --dataset wikiart -a vit_large --pt_style csd --feattype normal --world-size 1 
--dist-url tcp://localhost:6001 -b 128 -j 8 --embed_dir ./embeddings --data-dir <path to WikiArt dataset>
--model_path <path to CSD weights>

Evaluate

Once you've generated the embeddings, run the following command:

python search.py --mode artist --dataset wikiart --chunked --query-chunk-dir <path to query embeddings above> 
    --database-chunk-dir <path to database embeddings above> --topk 1 10 100 1000 --method IP --data-dir <path to WikiArt dataset>

Train CSD on LAION-Styles

You can also train style descriptors for your own datasets. A sample code for training on LAION-styles dataset is provided below.

We have started to release the Contra-Styles (referred to as LAION-Styles in the paper) dataset. The dataset is available here and will keep getting updated over the next few days as we are running profanity checks through NSFW and PhotoDNA. We will update here once the dataset has been completely uploaded.

export PYTHONPATH="$PWD:$PYTHONPATH"

torchrun --standalone --nproc_per_node=4 CSD/train_csd.py --arch vit_base -j 8 -b 32 --maxsize 512 --resume_if_available --eval_k 1 10 100 --use_fp16 --use_distributed_loss --train_set laion_dedup --train_path <PATH to LAION-Styles> --eval_path <PATH to WikiArt/some val set>  --output_dir <PATH to save checkpoint>

Pending items

We will soon release the code to compute the artists' prototypical style representations and compute similarity score against any given generation. ETA end of June'24.

Cite us

@article{somepalli2024measuring,
  title={Measuring Style Similarity in Diffusion Models},
  author={Somepalli, Gowthami and Gupta, Anubhav and Gupta, Kamal and Palta, Shramay and Goldblum, Micah and Geiping, Jonas and Shrivastava, Abhinav and Goldstein, Tom},
  journal={arXiv preprint arXiv:2404.01292},
  year={2024}
}