UNITPathSSL: Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for Histopathology Image

Official PyTorch implementation of the TMI 2023 paper
[Paper] [PDF]

Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for Histopathology Image
Zhiyun Song , Penghui Du , Junpeng Yan , Kailu Li , Jianzhong Shou , Maode Lai , Yubo Fan , Yan Xu

Abstract: Self-supervised pretraining attempts to enhance model performance by obtaining effective features from unlabeled data, and has demonstrated its effectiveness in the field of histopathology images. Despite its success, few works concentrate on the extraction of nucleus-level information, which is essential for pathologic analysis. In this work, we propose a novel nucleus-aware self-supervised pretraining framework for histopathology images. The framework aims to capture the nuclear morphology and distribution information through unpaired image-to-image translation between histopathology images and pseudo mask images. The generation process is modulated by both conditional and stochastic style representations, ensuring the reality and diversity of the generated histopathology images for pretraining. Further, an instance segmentation guided strategy is employed to capture instance-level information. The experiments on 7 datasets show that the proposed pretraining method outperforms supervised ones on Kather classification, multiple instance learning, and 5 dense-prediction tasks with the transfer learning protocol, and yields superior results to other self-supervised approaches on 8 semi-supervised tasks.

Release notes

This repository depends on StyleGAN2-ADA-PyTorch and Detectron2 for simultaneously generating and segmenting nuclei in pathology images.

Requirements

  • We recommend Linux for performance and compatibility reasons.
  • 1–8 high-end NVIDIA GPUs with at least 12 GB of memory.
  • CUDA toolkit 10.2 or later. Use at least version 11.1 if running on RTX 3090. (Why is a separate CUDA toolkit installation required? See comments in #2.)
  • 64-bit Python 3.7 and PyTorch 1.8.0. See https://pytorch.org/ for PyTorch install instructions. One way to install PyTorch 1.8.0, verified by the authors, is: conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
  • StyleGAN2-ADA-PyTorch dependencies: pip install click requests tqdm pyspng ninja psutil scipy shapely opencv-python imageio-ffmpeg==0.4.3 according to StyleGAN2-ADA-PyTorch and our requirements.
  • Detectron2 dependencies: first cd ../, then run python -m pip install -e UNITPathSSL, following the Detectron2 installation instructions.

Preparing datasets

All training datasets should be placed under CYCLE.DATASET_ROOT as defined in the config file. The overall structure should look like this:

Path Description
CYCLE.DATASET_ROOT Main directory of datasets
  ├ CYCLE.DATASET_A_NAME Generated mask data root directory
  │   ├ cell_mask Instance segmentation ground truth (.npy format) of generated mask data
  │   └ CYCLE.DATASET_A_SCALE Specified scale of generated mask data (.png format)
  └ CYCLE.DATASET_B_NAME Real pathology data root directory
      └ CYCLE.DATASET_B_SCALE Specified scale of real pathology images (.png format)
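As a quick sanity check before training, the expected layout above can be verified with a small helper. This is a hypothetical utility, not part of the repository; the function name and arguments are illustrative, and the actual values come from CYCLE.* in your config file.

```python
import os

def check_dataset_layout(root, a_name, a_scale, b_name, b_scale):
    """Return the list of expected subdirectories missing under `root`.

    An empty list means the layout matches the structure described above:
    <root>/<a_name>/cell_mask, <root>/<a_name>/<a_scale>,
    and <root>/<b_name>/<b_scale>.
    """
    expected = [
        os.path.join(root, a_name, "cell_mask"),  # .npy instance ground truth
        os.path.join(root, a_name, a_scale),      # .png generated masks
        os.path.join(root, b_name, b_scale),      # .png real pathology images
    ]
    return [p for p in expected if not os.path.isdir(p)]
```

Running it with your config values and getting an empty list back confirms the directories are where the training code will look for them.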

Datasets for testing segmentation performance should be prepared according to Detectron2-Use Builtin Datasets or Use Custom Datasets. For our setting, we recommend registering a COCO Format Dataset, which means adjusting gt_json_file and gt_path in train.py to your ground-truth paths.
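For reference, a COCO-format gt_json_file is a JSON document with images, annotations, and categories keys. The sketch below shows a minimal skeleton with hypothetical file names and coordinates; your actual annotations would be produced from the nucleus instance masks.

```python
import json

# Minimal COCO-format annotation skeleton (illustrative values only).
coco_skeleton = {
    "images": [
        {"id": 1, "file_name": "patch_0001.png", "height": 256, "width": 256},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [34, 50, 12, 14],  # COCO convention: [x, y, width, height]
            "area": 168,
            "iscrowd": 0,
            # Polygon segmentation: flat [x1, y1, x2, y2, ...] vertex list
            "segmentation": [[34, 50, 46, 50, 46, 64, 34, 64]],
        },
    ],
    "categories": [{"id": 1, "name": "nucleus"}],
}

with open("gt_annotations.json", "w") as f:
    json.dump(coco_skeleton, f)
```

A file in this shape can then be registered with Detectron2's COCO dataset utilities and pointed to by gt_json_file in train.py.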

Getting started

We provide checkpoints used in the transfer learning experiments in our paper via Google Drive. Please download the checkpoints from the URL: https://drive.google.com/drive/folders/1Pg2YRjyIxCPBsL7VVFZIaiVCB0iWKLJq?usp=sharing

Then, you can reproduce the results in our paper. For example, you can train the Panoptic FPN on the Kumar dataset with:

python tools/train_net.py --config-file ./configs/ours/seg/kumar/proposed_panoptic.yaml 

Moreover, you can use the pretrained backbone to extract features for various tasks, such as classification, MIL, and the calculation of fidelity metrics (such as FID). An example of extracting the features is provided in tools/classify_demo.ipynb.
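Once features have been extracted, a fidelity metric like FID reduces to the Frechet distance between two Gaussian fits. The sketch below implements that standard formula with NumPy/SciPy; it assumes you already have two (N, D) feature arrays (e.g. from the pretrained backbone), and is not the repository's own metric code.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussian fits of two feature sets.

    feats_a, feats_b: (N, D) arrays; N should be well above D so the
    covariance estimates are reliable.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; discard the tiny
    # imaginary component that sqrtm can introduce numerically.
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```

Identical feature sets give a distance of (numerically) zero, and the value grows as the two distributions drift apart, which is what makes it usable as a generation-fidelity score.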

Citation

@ARTICLE{UNITPATHSSL,
  author={Song, Zhiyun and Du, Penghui and Yan, Junpeng and Li, Kailu and Shou, Jianzhong and Lai, Maode and Fan, Yubo and Xu, Yan},
  journal={IEEE Transactions on Medical Imaging}, 
  title={Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for Histopathology Images}, 
  year={2024},
  volume={43},
  number={1},
  pages={459-472},
  doi={10.1109/TMI.2023.3309971}}

Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.
