D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
- [July 2024] We publicly release source code and pre-trained D-MASTER model weights!
- [Jun 2024] D-MASTER is accepted at MICCAI 2024. Congratulations to all the authors. See you all at MICCAI 2024 under the Moroccan sun!
- [June 2024] We released an arXiv version. See our updated arXiv paper for more details!
- [June 2024] We release the RSNA-BSD1K dataset, a bounding-box-annotated subset of 1,000 mammograms from the RSNA Breast Screening Dataset, to support further research in BCDM!
- [May 2024] We release the D-MASTER benchmark.
D-MASTER is a transformer-based Domain-invariant Mask Annealed Student Teacher Autoencoder Framework for cross-domain breast cancer detection from mammograms (BCDM). It integrates a novel mask-annealing technique and an adaptive confidence refinement module. Unlike traditional pretraining with Masked Autoencoders (MAEs), which leverages massive datasets before fine-tuning on smaller ones, D-MASTER introduces a learnable masking technique for the MAE branch. This technique generates masks of varying complexity, which are then reconstructed by the Deformable DETR (DefDETR) encoder and decoder. By applying this self-supervised task to target images, our approach enables the encoder to acquire domain-invariant features and improve target representations.
🔥 Check out our website for an overview!
RSNA-BSD1K is a bounding box annotated subset of 1,000 mammograms from the RSNA Breast Screening Dataset, designed to support further research in breast cancer detection from mammograms (BCDM). The original RSNA dataset consists of 54,706 screening mammograms, containing 1,000 malignancies from 8,000 patients. From this, we curated RSNA-BSD1K, which includes 1,000 mammograms with 200 malignant cases, annotated at the bounding box level by two expert radiologists.
🔥 Check out our released dataset for more details!
- Structure

      └─ rsna-bsd1k
          └─ annotations
              └─ instances_full.json
              └─ instances_val.json
          └─ images
              └─ train
              └─ val

- Put the dataset in the DATA_ROOT folder.
- Add the rsna dataset in datasets/coco_style_dataset.py.
- Done! You can now use the dataset for training and evaluation.
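To sanity-check the annotation files after copying them, a small helper like the one below may be useful. It is only a sketch: it assumes the JSON files follow the standard COCO detection layout (top-level images and annotations lists, with an image_id field on each annotation), and the function name summarize_coco is hypothetical, not part of this repository.

```python
from collections import Counter


def summarize_coco(coco: dict) -> dict:
    """Count images, boxes, and images that carry at least one box.

    ASSUMPTION: `coco` is a dict loaded from a COCO-style detection JSON.
    This helper is illustrative and not part of the D-MASTER code base.
    """
    images = coco.get("images", [])
    annotations = coco.get("annotations", [])
    # Number of boxes attached to each image id.
    boxes_per_image = Counter(ann["image_id"] for ann in annotations)
    return {
        "num_images": len(images),
        "num_boxes": len(annotations),
        "num_images_with_boxes": len(boxes_per_image),
    }
```

Load a file with json.load and pass the resulting dict to summarize_coco; for RSNA-BSD1K, the image count should be consistent with the dataset description above.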
- Linux, CUDA >= 11.1, GCC >= 8.4
- Python >= 3.8
- torch >= 1.10.1, torchvision >= 0.11.2
- Other requirements:
  pip install -r requirements.txt
cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
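Before compiling, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a rough sketch, not part of this repository: the helper names are hypothetical, and the version parsing is deliberately simple (it keeps the first three numeric components and drops local suffixes such as +cu111).

```python
import re
from importlib import metadata

# Minimums copied from the requirement list above.
MINIMUMS = {"torch": "1.10.1", "torchvision": "0.11.2"}


def version_tuple(version: str) -> tuple:
    """Turn '1.10.1+cu111' into (1, 10, 1); local suffixes are ignored."""
    core = version.split("+")[0]
    return tuple(int(n) for n in re.findall(r"\d+", core)[:3])


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numerically, so that 1.9.0 is correctly older than 1.10.1."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS) -> dict:
    """Map each package to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling check_environment() returns a dictionary keyed by package name; a None entry means the package was not found in the current environment.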
We provide the following benchmarks:
- city2foggy: the cityscapes dataset is used as the source domain, and foggy_cityscapes (0.02) is used as the target domain.
- sim2city: the sim10k dataset is used as the source domain, and cityscapes is used as the target domain, where only the AP of cars is recorded.
- city2bdd: the cityscapes dataset is used as the source domain, and bdd100k-daytime is used as the target domain.
You can download the raw data from the official websites: cityscapes, foggy_cityscapes, sim10k, bdd100k. We provide annotations converted into COCO style; download them from here and organize the datasets and annotations as follows:
[data_root]
└─ inbreast
    └─ annotations
        └─ instances_train.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ ddsm
    └─ annotations
        └─ instances_train.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ rsna-bsd1k
    └─ annotations
        └─ instances_full.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ cityscapes
    └─ annotations
        └─ cityscapes_train_cocostyle.json
        └─ cityscapes_train_caronly_cocostyle.json
        └─ cityscapes_val_cocostyle.json
        └─ cityscapes_val_caronly_cocostyle.json
    └─ leftImg8bit
        └─ train
        └─ val
└─ foggy_cityscapes
    └─ annotations
        └─ foggy_cityscapes_train_cocostyle.json
        └─ foggy_cityscapes_val_cocostyle.json
    └─ leftImg8bit_foggy
        └─ train
        └─ val
└─ sim10k
    └─ annotations
        └─ sim10k_train_cocostyle.json
        └─ sim10k_val_cocostyle.json
    └─ JPEGImages
└─ bdd10k
    └─ annotations
        └─ bdd100k_daytime_train_cocostyle.json
        └─ bdd100k_daytime_val_cocostyle.json
    └─ JPEGImages
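A quick way to verify the layout before training is to check that the expected files and folders exist. The sketch below does this for the ddsm and inbreast entries of the tree; the helper name missing_paths and the EXPECTED list are hypothetical and should be adjusted to the benchmark you plan to run.

```python
from pathlib import Path

# Relative paths taken from the directory tree above (ddsm and inbreast only).
EXPECTED = [
    "ddsm/annotations/instances_train.json",
    "ddsm/annotations/instances_val.json",
    "ddsm/images/train",
    "ddsm/images/val",
    "inbreast/annotations/instances_train.json",
    "inbreast/annotations/instances_val.json",
    "inbreast/images/train",
    "inbreast/images/val",
]


def missing_paths(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

An empty list from missing_paths("<YOUR/DATA/ROOT>") means every listed entry was found.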
To use additional datasets, edit datasets/coco_style_dataset.py and add key-value pairs to CocoStyleDataset.img_dirs and CocoStyleDataset.anno_files.
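As a rough illustration of what such key-value pairs might look like, the sketch below models the two lookup tables on a stand-in class. The nesting (dataset name, then split, then a path relative to DATA_ROOT), the use of instances_full.json for the training split, and the resolve helper are all assumptions made for this example; mirror the existing entries in datasets/coco_style_dataset.py rather than this sketch.

```python
import os


class CocoStyleDatasetSketch:
    """Stand-in for CocoStyleDataset; only the two lookup tables are modelled."""

    # ASSUMPTION: dataset name -> split -> path relative to DATA_ROOT.
    img_dirs = {
        "rsna-bsd1k": {
            "train": "rsna-bsd1k/images/train",
            "val": "rsna-bsd1k/images/val",
        },
    }
    # ASSUMPTION: instances_full.json serves as the training annotation file.
    anno_files = {
        "rsna-bsd1k": {
            "train": "rsna-bsd1k/annotations/instances_full.json",
            "val": "rsna-bsd1k/annotations/instances_val.json",
        },
    }

    @classmethod
    def resolve(cls, data_root, name, split):
        """Hypothetical helper: return (image_dir, annotation_file) for a split."""
        return (
            os.path.join(data_root, cls.img_dirs[name][split]),
            os.path.join(data_root, cls.anno_files[name][split]),
        )
```

Keeping both tables keyed by the same dataset name means a missing entry fails early with a KeyError instead of silently pointing at the wrong folder.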
As discussed in the implementation details of the paper, our method is designed as a three-stage paradigm to save computation cost. We first perform source_only training, which trains the model in the standard supervised way on the labeled source domain. Then, we perform cross_domain_mae to train the model with the MAE branch. Finally, we perform teaching, which uses a teacher-student framework with the MAE branch and selective retraining.
For example, for the ddsm2inbreast benchmark, first edit the files in configs/def-detr-base/ddsm2inbreast/ to specify your own DATA_ROOT and OUTPUT_DIR, then run:
sh configs/def-detr-base/ddsm2inbreast/source_only.sh
sh configs/def-detr-base/ddsm2inbreast/cross_domain_mae.sh
sh configs/def-detr-base/ddsm2inbreast/teaching.sh
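If you prefer to launch the three stages from one place, a thin wrapper along the following lines could work. This is a sketch only: the function names are hypothetical, and it assumes every benchmark folder contains the same three script names as the ddsm2inbreast example above.

```python
import subprocess

# Stage order as described above: each stage builds on the previous one.
STAGES = ("source_only", "cross_domain_mae", "teaching")


def stage_commands(benchmark, config_root="configs/def-detr-base"):
    """Build the shell commands for the three stages, in training order."""
    return [["sh", f"{config_root}/{benchmark}/{stage}.sh"] for stage in STAGES]


def run_stages(benchmark, runner=subprocess.run):
    """Run each stage in order; check=True aborts on the first failing stage."""
    for cmd in stage_commands(benchmark):
        runner(cmd, check=True)
```

Calling run_stages("ddsm2inbreast") from the repository root runs the same three commands shown above and stops with an exception if any stage exits with a non-zero status.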
We use tensorboard to record the loss and results. Run the following command to see the curves during training:
tensorboard --logdir=<YOUR/LOG/DIR>
To evaluate the trained model and get the predicted results, run:
sh configs/def-detr-base/city2foggy/evaluation.sh
If the model is adapted on a classification dataset, the predictions produced during inference are stored in the ./outputs/outputs.csv file. To generate predictions, set --csv True in the evaluation.sh script and run:
sh configs/def-detr-base/mammo/evaluation.sh
The ./outputs/outputs.csv file can then be used to compute the required metrics for the target classification dataset on which the model was adapted. Next, run:
python match_id_csv_json.py
Finally, run:
python eval_cview_csv.py
This will give you the TN, TP, FN, FP, AUC, and NPV scores.
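For reference, these quantities can be computed from per-image labels and scores as in the sketch below. It is not the code in eval_cview_csv.py: the function names, the 0.5 decision threshold, and the label convention (1 = malignant, 0 = benign) are assumptions made for illustration.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Return (tn, fp, fn, tp) for binary labels; 1 is the positive class."""
    tn = fp = fn = tp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def npv(tn, fn):
    """Negative predictive value: TN / (TN + FN)."""
    return tn / (tn + fn) if (tn + fn) else float("nan")


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the number of samples, which is fine
    for small evaluation sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

NPV depends on the chosen threshold, while AUC does not, so the two can move independently when the operating point changes.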
We conduct all experiments with a batch size of 8 (for the source_only stage, 8 labeled samples; for the cross_domain_mae and MRT teaching stages, 8 labeled samples and 8 unlabeled samples) on 4 NVIDIA A100 GPUs.
inhouse2inbreast: Inhouse → INBreast
backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 64.3 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 67.3 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 71.9 | logs & weights |
inhouse2rsna: Inhouse → RSNA-BSD1K
backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 53.2 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 54.6 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 58.7 | logs & weights |
ddsm2inhouse: DDSM → Inhouse
backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
ddsm2inbreast: DDSM → INBreast
backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
city2foggy: cityscapes → foggy cityscapes (0.02)
backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 29.5 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 35.8 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 51.2 | logs & weights |
sim2city: sim10k → cityscapes (car only)
backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 53.2 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 57.1 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 62.0 | logs & weights |
city2bdd: cityscapes → bdd100k (daytime)
backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
---|---|---|---|---|---|
resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
This repository is constructed and maintained by Tajamul Ashraf.
If you find our paper or project useful, please cite our work using the following BibTeX:
@article{ashraf2024dmastermaskannealedtransformer,
title={D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms},
author={Tajamul Ashraf and Krithika Rangarajan and Mohit Gambhir and Richa Gabha and Chetan Arora},
year={2024},
eprint={2407.06585},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.06585},
}
Thanks for your attention.