✨✨✨ Version V2 restructures the code to make it easier for users to customize their own datasets, model structures, and training processes.
The latest USFM weights have been released (USFM_latest.pth).
- USFM is the first foundation model for medical ultrasound images, developed and maintained by the Laboratory of Medical Imaging and Artificial Intelligence, Fudan University.
- USFM aims to accelerate the modeling of existing medical ultrasound image analysis tasks with high performance and efficiency (less labeled data and fewer training epochs).
- The capability of USFM comes from unsupervised pre-training on a large multi-organ, multi-center, multi-device ultrasound database containing two million images acquired with different ultrasound devices around the globe, which supports the generalizability and versatility of USFM.
- To adapt to the characteristics of ultrasound images, the unsupervised pre-training of USFM is based on Masked Image Modeling (MIM) with additional frequency-domain mask learning, which helps capture image texture features.
- Experiments validate the performance and labeling efficiency of USFM on common disease classification, tissue segmentation, and image enhancement tasks. More tasks are in progress.
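To make the masking idea above concrete, here is a minimal, illustrative sketch of two kinds of masks: a random spatial patch mask, as used in generic masked image modeling, and a mask applied in the frequency domain. It is not USFM's actual pre-training code. The function names (`random_patch_mask`, `low_pass_filter`), the choice of a low-pass mask, and the naive pure-Python DFT are all assumptions made for illustration.

```python
import cmath
import random


def random_patch_mask(grid_size, mask_ratio, seed=0):
    """Return a grid_size x grid_size grid of booleans; True marks a masked patch."""
    total = grid_size * grid_size
    num_masked = round(total * mask_ratio)
    rng = random.Random(seed)
    masked = set(rng.sample(range(total), num_masked))
    return [[(r * grid_size + c) in masked for c in range(grid_size)]
            for r in range(grid_size)]


def dft2(image):
    """Naive 2-D DFT of a square image (list of lists). O(n^4): tiny inputs only."""
    n = len(image)
    return [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * (u * y + v * x) / n)
                 for y in range(n) for x in range(n))
             for v in range(n)]
            for u in range(n)]


def idft2(spectrum):
    """Inverse of dft2; returns the real part of the reconstructed image."""
    n = len(spectrum)
    return [[(sum(spectrum[u][v] * cmath.exp(2j * cmath.pi * (u * y + v * x) / n)
                  for u in range(n) for v in range(n)) / (n * n)).real
             for x in range(n)]
            for y in range(n)]


def low_pass_filter(image, radius):
    """Zero every frequency whose wrapped index exceeds `radius` on either axis."""
    n = len(image)
    spectrum = dft2(image)
    for u in range(n):
        for v in range(n):
            # Indices u and n - u describe the same frequency magnitude,
            # so measure the wrapped distance to the DC component.
            if min(u, n - u) > radius or min(v, n - v) > radius:
                spectrum[u][v] = 0
    return idft2(spectrum)


if __name__ == "__main__":
    image = [[float((x * y) % 5) for x in range(8)] for y in range(8)]
    mask = random_patch_mask(grid_size=4, mask_ratio=0.75)
    print(sum(cell for row in mask for cell in row), "of 16 patches masked")
    smooth = low_pass_filter(image, radius=1)
    print([round(value, 2) for value in smooth[0]])
```

In a masked-modeling setup, the model would receive the masked or filtered input and be trained to reconstruct the original image, so removing frequency bands forces it to recover the fine detail they carry.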
# clone project
git clone https://github.com/openmedlab/USFM.git
cd USFM
# [OPTIONAL] create conda environment
conda create -n USFM python=3.9
conda activate USFM
# install pytorch according to instructions
# https://pytorch.org/get-started/
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
# install requirements
pip install -r requirements.txt
# install mmcv
pip install mmcv==2.2.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.4/index.html
# install mmsegmentation [important: from modified mmseg]
mkdir -p useful_modules
cd useful_modules
git clone git@github.com:George-Jiao/mmsegmentation.git
cd mmsegmentation
git checkout gj_mmcv2_2_0
pip install -v -e .
cd ../..
usdsgen is a USFM-based package for generalizing to downstream tasks on ultrasound images. Install it from the repository root:
pip install -v -e .
You can save datasets in any folder; the default is the folder [datasets].
The folder format is generally:
datasets/
    ├── Seg/
        ├── dataset_names/
            ├── trainning_set/
                ├── image/ img1.png..
                ├── mask/ img1.png..
            ├── val_set/
                ├── image/
                ├── mask/
            ├── test_set/
                ├── image/
                ├── mask/
    ├── Cls/
        ├── dataset_names/
            ├── trainning_set/
                ├── class1/
                ├── class2/
            ├── val_set/
                ├── class1/
                ├── class2/
            ├── test_set/
                ├── class1/
                ├── class2/
Advanced: data configuration is in the folder [configs/data/].
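Before training, it can help to confirm that a dataset follows the folder format above. The sketch below is a hypothetical helper, not part of usdsgen: it counts image/mask pairs per split for a segmentation dataset and files per class for a classification dataset. It assumes that each mask has the same file name as its image (as the `img1.png` entries suggest) and uses the split folder names exactly as spelled in the layout.

```python
from pathlib import Path

# Split folder names, spelled exactly as in the folder format above.
SPLITS = ("trainning_set", "val_set", "test_set")


def check_seg_dataset(root):
    """Count image/mask pairs per split and list file names present on only one side."""
    root = Path(root)
    report = {}
    for split in SPLITS:
        image_dir, mask_dir = root / split / "image", root / split / "mask"
        if not image_dir.is_dir() or not mask_dir.is_dir():
            raise FileNotFoundError(f"missing image/ or mask/ under {root / split}")
        images = {p.name for p in image_dir.iterdir() if p.is_file()}
        masks = {p.name for p in mask_dir.iterdir() if p.is_file()}
        # Assumption: a mask shares its file name with the image it annotates.
        report[split] = {"pairs": len(images & masks),
                         "unpaired": sorted(images ^ masks)}
    return report


def check_cls_dataset(root):
    """Count files per class folder in each split."""
    root = Path(root)
    report = {}
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder {split_dir}")
        report[split] = {d.name: sum(1 for p in d.iterdir() if p.is_file())
                         for d in sorted(split_dir.iterdir()) if d.is_dir()}
    return report
```

For example, `check_seg_dataset("datasets/Seg/dataset_names")` returns one entry per split; any name listed under `"unpaired"` is an image without a mask or a mask without an image.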
Download Seg_toy_dataset.tar.gz from Google Drive and save it in the folder [./datasets].
Note: the toy dataset is only meant for test-running the pipeline. It contains 199 images for training, 50 images for validation, and 50 images for testing.
mkdir -p ./datasets/Seg/
tar -xzvf ./datasets/Seg_toy_dataset.tar.gz -C ./datasets/Seg/
Download the USFM weights (USFM_latest.pth) from Google Drive and save them as [./assets/FMweight/USFM_latest.pth].
# set the environment variables
export batch_size=16
export num_workers=4
export CUDA_VISIBLE_DEVICES=0,1,2
export devices=3 # number of GPUs
export dataset=toy_dataset
export epochs=400
export pretrained_path=./assets/FMweight/USFM_latest.pth
export task=Seg # Cls for classification, Seg for segmentation
export model=Seg/SegVit # SegVit or Upernet for segmentation, vit for classification
# Segmentation task
python main.py experiment=task/$task data=Seg/$dataset data="{batch_size:$batch_size,num_workers:$num_workers}" \
model=$model model.model_cfg.backbone.pretrained=$pretrained_path \
train="{epochs:$epochs, accumulation_steps:1}" L="{devices:$devices}" tag=USFM
# Classification task
export task=Cls
export model=Cls/vit
python main.py experiment=task/$task data=Cls/$dataset data="{batch_size:$batch_size,num_workers:$num_workers}" \
model=$model model.model_cfg.backbone.pretrained=$pretrained_path \
train="{epochs:$epochs, accumulation_steps:1}" L="{devices:$devices}" tag=USFM
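The commands above pass a list of `key=value` overrides to `main.py`. When launching several runs (for example, one per dataset), it can be convenient to assemble that list in Python. The helper below is a hypothetical sketch, not part of the repository: `build_overrides` and its default values are assumptions, while the override keys are copied from the commands above.

```python
import shlex


def build_overrides(task, dataset, model, pretrained_path,
                    batch_size=16, num_workers=4, epochs=400, devices=1, tag="USFM"):
    """Return the argument list for main.py, mirroring the shell commands above."""
    return [
        f"experiment=task/{task}",
        f"data={task}/{dataset}",
        f"data={{batch_size:{batch_size},num_workers:{num_workers}}}",
        f"model={model}",
        f"model.model_cfg.backbone.pretrained={pretrained_path}",
        f"train={{epochs:{epochs}, accumulation_steps:1}}",
        f"L={{devices:{devices}}}",
        f"tag={tag}",
    ]


if __name__ == "__main__":
    args = build_overrides("Seg", "toy_dataset", "Seg/SegVit",
                           "./assets/FMweight/USFM_latest.pth", devices=3)
    # Print a copy-pasteable command; shlex.quote protects the braces and spaces.
    print("python main.py " + " ".join(shlex.quote(a) for a in args))
```

The returned list can also be passed directly to `subprocess.run(["python", "main.py", *args])`, which avoids shell quoting altogether.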
The results of the experiment are saved in the logs/fineturne folder.
graph TD;
endpoint[main.py] --> trainers[usdsgen/trainer/ <br/> configs/experiment/]
trainers[usdsgen/trainer/ <br/> configs/experiment/] --> data[usdsgen/data/ <br/> configs/data/]
trainers[usdsgen/trainer/ <br/> configs/experiment/] --> model[usdsgen/model/ <br/> configs/model/]
You can conveniently configure different models, datasets, and training processes in the usdsgen and configs folders.
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
Our code is based on transformer, pytorch-image-models, and lightning-hydra-template. We thank the authors for releasing their code.
Have a question? Found a bug? Missing a specific feature? Feel free to file a new issue, discussion, or PR with an appropriate title and description.
Please run the pre-commit hooks to check your code before committing.
# pip install pre-commit
pre-commit install
pre-commit run -a
If you find USFM or this project useful in your research, please consider citing:
@article{JIAO2024103202,
title = {USFM: A universal ultrasound foundation model generalized to tasks and organs towards label efficient image analysis},
journal = {Medical Image Analysis},
volume = {96},
pages = {103202},
year = {2024},
issn = {1361-8415},
doi = {https://doi.org/10.1016/j.media.2024.103202},
url = {https://www.sciencedirect.com/science/article/pii/S1361841524001270},
author = {Jing Jiao, Jin Zhou, Xiaokang Li, ..., Yuanyuan Wang and Yi Guo},
keywords = {Ultrasound image, Foundation model, Label efficiency, Task adaptability},
}