
The code for the paper "Contrastive Quantization with Code Memory for Unsupervised Image Retrieval" (AAAI'22, Oral).


MeCoQ: Contrastive Quantization with Code Memory for Unsupervised Image Retrieval

[toc]

1. Introduction

This repository provides the code for our paper at AAAI 2022 (Oral):

Contrastive Quantization with Code Memory for Unsupervised Image Retrieval. Jinpeng Wang, Ziyun Zeng, Bin Chen, Tao Dai, Shu-Tao Xia. [arXiv].

We propose MeCoQ, an unsupervised deep quantization method for image retrieval. Unlike reconstruction-based methods that learn to preserve pairwise similarity information in continuous embeddings, MeCoQ learns quantized representations via contrastive learning. To boost contrastive learning, MeCoQ leverages a quantization code memory during training. Experiments on the CIFAR-10 (under two evaluation protocols), Flickr-25K, and NUS-WIDE datasets demonstrate the effectiveness of MeCoQ.

In the following, we will guide you through using this repository step by step. 🤗

2. Preparation

git clone https://github.com/gimpong/AAAI22-MeCoQ.git
cd AAAI22-MeCoQ/

2.1 Requirements

  • python 3.7.9
  • numpy 1.19.1
  • pandas 1.0.5
  • pytorch 1.3.1
  • torchvision 0.4.2
  • pillow 8.0.0
  • opencv-python 3.4.2
  • tqdm 4.51.0

2.2 Download the image datasets and organize them properly

Before running the code, make sure that everything needed is in place. First, the working directory is expected to be organized as below:

AAAI22-MeCoQ/
  • data/
    • Flickr25k/
      • img.txt
      • targets.txt
    • Nuswide/
      • database.txt
      • test.txt
      • train.txt
  • datasets/
    • CIFAR-10/
      • cifar-10-batches-py/
        • batches.meta
        • data_batch_1
        • ...
    • Flickr25K/
      • mirflickr/
        • im1.jpg
        • im2.jpg
        • ...
    • NUS-WIDE/
      • Flickr/
        • actor/
          • 0001_2124494179.jpg
          • 0002_174174086.jpg
          • ...
        • administrative_assistant/
          • ...
        • ...
  • scripts/
    • run0001.sh
    • run0002.sh
    • ...
  • main.py
  • engine.py
  • data.py
  • utils.py
  • loss.py

Notes

  • The data/ folder is the collection of data splits for the Flickr25K and NUS-WIDE datasets. The raw images of Flickr25K and NUS-WIDE should be downloaded additionally and arranged in datasets/Flickr25K/ and datasets/NUS-WIDE/, respectively. We provide copies of these image datasets; you can download them via Google Drive or Baidu Wangpan (Web Drive, password: n307).

  • For experiments on CIFAR-10 dataset, you can use the option --download_cifar10 when running main.py.

3. Train and then evaluate

To facilitate reproducibility, we provide scripts with the configurations for each experiment. These scripts can be found under the scripts/ folder. For example, to train and evaluate a 16-bit MeCoQ model on the Flickr25K dataset, run

cd scripts/
# '0' is the id of GPU
bash run0001.sh 0

The script run0001.sh includes the running commands:

#!/bin/bash
cd ..
python main.py \
    --notes Flickr16bits \
    --device cuda:$1 \
    --dataset Flickr25K \
    --trainable_layer_num 0 \
    --M 2 \
    --feat_dim 32 \
    --T 0.4 \
    --hp_beta 1e-1 \
    --hp_lambda 0.5 \
    --mode debias --pos_prior 0.15 \
    --queue_begin_epoch 5 \
    --topK 5000
cd -
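For reference, the code length in bits follows from the quantization hyper-parameters: with M codebooks of K codewords each, a database item is stored as M sub-codes of log2(K) bits. The codebook size K is set inside the code; K = 256 below is an assumption (a common product-quantization choice) that makes the arithmetic match the 16-bit setting of --M 2 above:

```python
import math

def code_length_bits(M, K):
    """Bits per database item: M sub-codes, each indexing one of K codewords."""
    return M * int(math.log2(K))

# Assuming K = 256 codewords per codebook, --M 2 gives the 16-bit setting.
print(code_length_bits(M=2, K=256))  # 16
```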

After running a script, a series of folders and files will be saved under logs/ and checkpoints/, with file identifiers matching the --notes argument in run0001.sh (e.g., Flickr16bits).

Under logs/ , there will be a log file (e.g., Flickr16bits.log) and a folder of tensorboard files (e.g., Flickr16bits).

Under checkpoints/, there will be a folder (e.g., Flickr16bits/) with information for the final checkpoint, including the quantization codes (db_codes.npy) and labels (db_targets.npy) for the database set, the model checkpoint (model.cpt), and performance records (P_at_topK_curve.txt and PR_curve.txt).
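The saved arrays can be inspected with NumPy. A minimal sketch (file names as listed above; shapes are illustrative):

```python
import numpy as np

def load_db(ckpt_dir):
    """Load the saved database quantization codes and labels."""
    codes = np.load(f"{ckpt_dir}/db_codes.npy")
    targets = np.load(f"{ckpt_dir}/db_targets.npy")
    assert len(codes) == len(targets), "one label entry per database item"
    return codes, targets
```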

⚠️ Warning: exactly reproducing the same results across different software and hardware architectures is difficult 🤔

Initially, we tuned different experiments (e.g., different datasets and different quantization code lengths) separately on different servers in the authors' lab. These servers are equipped with three kinds of GPUs: NVIDIA® GeForce® GTX 1080 Ti (11 GB), NVIDIA® GeForce® RTX 2080 Ti (11 GB), and NVIDIA® Tesla® V100 (32 GB).

While preparing the code release, we unexpectedly found that even with the same code and the same hyper-parameter configuration (including fixed random seeds), running experiments on different servers can still yield different results. Such differences may be caused by various factors, e.g., the versions of drivers and libraries and the hardware architectures.

Unfortunately, we were not aware of this phenomenon during paper submission, and the reported results were obtained on mixed architectures. 😩

Here we report the results of running the scripts on the three kinds of GPUs in the table below. We have also uploaded the logs and checkpoint information for reference, which can be downloaded from Baidu Wangpan (Web Drive, password: ncw0).

| Script | Dataset | Code Length / bit | Distance Computation | GTX 1080 Ti MAP | RTX 2080 Ti MAP | V100 MAP | Log |
|---|---|---|---|---|---|---|---|
| run0001.sh | Flickr25K | 16 | Asymmetric | 81.3137 | 81.2682 | 81.6233 | Flickr16bits.log |
| run0002.sh | | | Symmetric | 79.9250 | 80.0099 | 80.3065 | Flickr16bitsSymm.log |
| run0003.sh | | 32 | Asymmetric | 82.3116 | 81.9112 | 81.0789 | Flickr32bits.log |
| run0004.sh | | | Symmetric | 81.5173 | 81.1909 | 80.4656 | Flickr32bitsSymm.log |
| run0005.sh | | 64 | Asymmetric | 82.6785 | 81.7833 | 78.2403 | Flickr64bits.log |
| run0006.sh | | | Symmetric | 82.2351 | 81.2302 | 77.0577 | Flickr64bitsSymm.log |
| run0007.sh | CIFAR-10 (I) | 16 | Asymmetric | 68.8245 | 68.3206 | 69.0129 | CifarI16bits.log |
| run0008.sh | | | Symmetric | 65.9515 | 65.0148 | 66.1888 | CifarI16bitsSymm.log |
| run0009.sh | | 32 | Asymmetric | 70.2410 | 69.9876 | 70.3119 | CifarI32bits.log |
| run0010.sh | | | Symmetric | 69.1810 | 68.7357 | 69.1754 | CifarI32bitsSymm.log |
| run0011.sh | | 64 | Asymmetric | 70.2445 | 70.2884 | 70.2405 | CifarI64bits.log |
| run0012.sh | | | Symmetric | 69.4085 | 69.3631 | 69.3487 | CifarI64bitsSymm.log |
| run0013.sh | CIFAR-10 (II) | 16 | Asymmetric | 62.8279 | 61.8231 | 62.5369 | CifarII16bits.log |
| run0014.sh | | | Symmetric | 60.3927 | 59.5196 | 60.0741 | CifarII16bitsSymm.log |
| run0015.sh | | 32 | Asymmetric | 64.0929 | 64.1100 | 63.1728 | CifarII32bits.log |
| run0016.sh | | | Symmetric | 62.1983 | 62.4287 | 61.4763 | CifarII32bitsSymm.log |
| run0017.sh | | 64 | Asymmetric | 65.0706 | 63.8214 | 64.6805 | CifarII64bits.log |
| run0018.sh | | | Symmetric | 63.8469 | 62.8956 | 63.2863 | CifarII64bitsSymm.log |
| run0019.sh | NUS-WIDE | 16 | Asymmetric | 76.3282 | 78.1548 | 78.8492 | Nuswide16bits.log |
| run0020.sh | | | Symmetric | 75.8496 | 77.0711 | 78.0642 | Nuswide16bitsSymm.log |
| run0021.sh | | 32 | Asymmetric | 82.1629 | 82.1288 | 82.3119 | Nuswide32bits.log |
| run0022.sh | | | Symmetric | 81.1774 | 81.1331 | 81.2273 | Nuswide32bitsSymm.log |
| run0023.sh | | 64 | Asymmetric | 83.0987 | 83.0466 | 83.0686 | Nuswide64bits.log |
| run0024.sh | | | Symmetric | 82.0026 | 82.2323 | 82.2421 | Nuswide64bitsSymm.log |

(Each row's log file is provided separately for each of the three GPUs in the uploaded archive.)
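The "Distance Computation" column refers to how query-database similarity is computed at retrieval time: asymmetric distance compares the raw query embedding against reconstructed database vectors, while symmetric distance quantizes the query as well. A minimal NumPy sketch of the two schemes, using standard product quantization with squared-L2 distance (variable names and sizes are illustrative, not the repository's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, d = 2, 4, 8                            # M codebooks, K codewords each, sub-dim d
codebooks = rng.normal(size=(M, K, d))       # toy codebooks
db_codes = rng.integers(0, K, size=(5, M))   # sub-codes for 5 database items

def decode(codes):
    """Reconstruct continuous vectors from per-codebook codeword indices."""
    return np.concatenate([codebooks[m][codes[:, m]] for m in range(M)], axis=1)

def encode(x):
    """Assign each sub-vector of x to its nearest codeword (squared L2)."""
    subs = x.reshape(M, d)
    return np.array([[np.argmin(((codebooks[m] - subs[m]) ** 2).sum(1))
                      for m in range(M)]])

def asymmetric_dist(query):
    """Compare the raw query against reconstructed database vectors."""
    return ((decode(db_codes) - query) ** 2).sum(1)

def symmetric_dist(query):
    """Quantize the query too, then compare reconstructions."""
    return ((decode(db_codes) - decode(encode(query))) ** 2).sum(1)
```

Asymmetric distance is usually more accurate because the query keeps its full precision; symmetric distance trades accuracy for the ability to precompute codeword-to-codeword distance tables.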

4. References

If you find this code useful or use this toolkit in your work, please consider citing:

@inproceedings{wang22mecoq,
  author={Wang, Jinpeng and Zeng, Ziyun and Chen, Bin and Dai, Tao and Xia, Shu-Tao},
  title={Contrastive Quantization with Code Memory for Unsupervised Image Retrieval},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}

5. Acknowledgements

Our code is based on the implementation of PyTorch SimCLR, MoCo, DCL, Deep-Unsupervised-Image-Hashing and CIBHash.

6. Contact

If you have any questions, you can raise an issue or email Jinpeng Wang (wjp20@mails.tsinghua.edu.cn). We will reply soon.
