
Segmentation-WSI-andU-Net-Manipulation

Objectives

Hi 😀

This repository provides the scripts for part 1 and part 2, together with the tiling and augmentation preprocessing.

The project is described as Segmentation: WSI and U-Net Pipeline Manipulation.

In summary, the mask of the WSI image is provided in 'annotation.csv' in a specific format. The steps are as follows:

  • Extracting the ROI information from the CSV file
  • Preprocessing the data to produce the desired mask in a downsampled version (using pyvips), reaching a state in which you can select the tissues whose specific masks you want
  • Tile preprocessing to provide image patches for training and validation, and augmentation of the data (using pyvips)
  • Running training and validation, inspecting some random predictions, and applying error analysis to improve the results
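The downsample step in the list above can be sketched in plain NumPy (the actual pipeline uses pyvips); `downsample` is an illustrative helper, not a function from this repository:

```python
import numpy as np

def downsample(mask: np.ndarray, factor: int) -> np.ndarray:
    """Downsample a 2-D mask by block averaging, mimicking the
    downsampled mask produced with pyvips (factor must divide
    both dimensions of the mask)."""
    h, w = mask.shape
    assert h % factor == 0 and w % factor == 0
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# toy 4x4 mask downsampled by 2 -> 2x2
small = downsample(np.eye(4), 2)
```

A real WSI mask would be downsampled by a much larger factor (e.g. 64, as in the smaller mask image shown below).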

Having gained a better understanding of its functionality, I believe it would be worthwhile to focus on a specific class of annotation terms, or on a selected tissue, for further refinement. Additionally, to achieve better results, I have prepared a pilot setup for fine-tuning SMP models from pretrained weights. This can be applied to an augmented dataset generated from tiles of the original image, especially when ample image data is available and can be processed on a powerful server.

The first part of the project prepares the data and its label mask, which is derived from the geometry data in the 'annotations.csv' file and the main image 'm9de8lfp.tif'. I did this by extracting the geometry data and rasterizing it over the original WSI image using the 'pyvips' and 'shapely' libraries. The mask is obtained as follows:
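The core of this step is turning each annotation polygon into a filled binary mask. A minimal sketch of that rasterization in plain NumPy (the pipeline itself uses shapely and pyvips; `rasterize_polygon` is a hypothetical helper using the standard even-odd ray-casting test):

```python
import numpy as np

def rasterize_polygon(vertices, height, width):
    """Fill a polygon (list of (x, y) vertices) into a binary mask
    using even-odd ray casting; a stand-in for the shapely-based
    rasterization of the CSV geometry data."""
    mask = np.zeros((height, width), dtype=np.uint8)
    xs, ys = zip(*vertices)
    n = len(vertices)
    for y in range(height):
        for x in range(width):
            inside = False
            j = n - 1
            for i in range(n):
                # does the horizontal ray from (x, y) cross edge (j, i)?
                if (ys[i] > y) != (ys[j] > y) and \
                   x < (xs[j] - xs[i]) * (y - ys[i]) / (ys[j] - ys[i]) + xs[i]:
                    inside = not inside
                j = i
            mask[y, x] = inside
    return mask

# a small square annotation rasterized onto an 8x8 canvas
square = [(2, 2), (6, 2), (6, 6), (2, 6)]
mask = rasterize_polygon(square, 8, 8)
```

In practice shapely's `Polygon` handles the geometry and the result is written at WSI scale with pyvips; this loop is only for illustrating the idea at toy size.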

extracted mask

For a better view of the mask extracted from the 'annotation.csv' file, see the next image, which is a smaller version:

m9de8lfp_MASK_new_64

The next step is preparing the mask data for image segmentation. To use the mask as ground truth, we need a binary mask. For segmentation there are two possible approaches: the first relates to the annotation terms, for which we can use the original mask; the second is tissue segmentation, which is covered in part 2. Below you can see the binary masks, the registered masks, and each tissue mask separately, prepared for further processing if needed.
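Separating a labeled mask into one binary mask per tissue, as described above, can be sketched as follows (`split_tissue_masks` is an illustrative helper, not code from the repository):

```python
import numpy as np

def split_tissue_masks(label_mask: np.ndarray) -> dict:
    """Turn an integer label mask (0 = background) into one binary
    mask per tissue label, one entry per labeled region class."""
    return {int(lbl): (label_mask == lbl).astype(np.uint8)
            for lbl in np.unique(label_mask) if lbl != 0}

labels = np.array([[0, 1, 1],
                   [2, 2, 0],
                   [0, 0, 2]])
masks = split_tissue_masks(labels)
```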

filled morphed mask filled morphed registered

labeling different region each tissue registered

For part 3, tiling is applied to produce patches for training and validation. You can find some random tiles with their masks in the following plot: random tiles
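The tiling step amounts to a sliding window that cuts aligned image and mask patches. A minimal NumPy sketch (the pipeline does this with pyvips at full WSI resolution; `tile_pairs` and the sizes below are illustrative):

```python
import numpy as np

def tile_pairs(image, mask, tile_size, stride):
    """Cut aligned (image, mask) patches with a sliding window,
    analogous to the pyvips-based tiling used for training data."""
    tiles = []
    h, w = image.shape[:2]
    for y in range(0, h - tile_size + 1, stride):
        for x in range(0, w - tile_size + 1, stride):
            tiles.append((image[y:y + tile_size, x:x + tile_size],
                          mask[y:y + tile_size, x:x + tile_size]))
    return tiles

img = np.zeros((1024, 1024, 3), dtype=np.uint8)
msk = np.zeros((1024, 1024), dtype=np.uint8)
patches = tile_pairs(img, msk, tile_size=512, stride=256)  # 3 x 3 = 9 tiles
```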

The last step is segmentation using SMP with PyTorch. As my computational resources and data are limited (epochs = 15; various augmentation methods are scripted as comments and tested to be applicable if needed), I have only designed a prototype version; you can see the results of applying a pretrained model with 'imagenet' weights in the following images. The validation loss plot is also shown. (I tried the SMP model without pretrained weights, and the results were not good enough at low epoch counts.)

val_loss of pretrained model

pretrained model result5 pretrained model result2 pretrained model result4

pretrained model result3

Lastly, please consider this a prototype; it can be adapted to the objective, the data, and the available computational resources. You can also download the prototype version of the fine-tuned pretrained model (15 epochs) from the following link: https://drive.google.com/file/d/1mlq5dmFtXn0TiE-1S-_iE6aD9xQ8O8fX/view?usp=sharing

Updates and comments: I augmented the data in three ways. Applying more augmentation methods, with strides that produce more images, would cost more time and resources, but could further improve the binary segmentation of tissue ROIs; more images are provided below. However, since the data diversity is not high, we may see some defects in the cost function on the validation set. This can be addressed by using more diverse WSIs for training and validation, and by generating more augmented data, which requires more computational resources for trial and error on training and validation in order to optimize (considering dropout if needed).
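A three-way augmentation like the one described can be sketched with simple flips in NumPy (the repository scripts the actual augmentations with albumentations; `augment_three_ways` and the flip choice are illustrative assumptions):

```python
import numpy as np

def augment_three_ways(image, mask):
    """Return the original plus two flipped copies of an (image, mask)
    pair, flipping image and mask together so they stay aligned."""
    pairs = [(image, mask)]
    pairs.append((np.fliplr(image).copy(), np.fliplr(mask).copy()))
    pairs.append((np.flipud(image).copy(), np.flipud(mask).copy()))
    return pairs

img = np.arange(16).reshape(4, 4)
msk = (img > 7).astype(np.uint8)
augmented = augment_three_ways(img, msk)  # dataset tripled
```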

The updated version, fine-tuned for 30 epochs, has the following cost function. It is unstable in some epochs, which may stem from the lack of diverse data in both training and testing, and the model may have somewhat overfitted the training set. image

Some result images of this updated algorithm on test data: image image image

And there is another update (the peaks in the cost function might be resolved by more diversified data for training and validation): tile_size = 512, stride = 256; 171 images for training, 45 for validation; total tiles: 216; augmentations = [HorizontalFlip(p=1), ElasticTransform(p=1)]
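For reference, the number of tiles a sliding window with these parameters yields follows the usual formula; the image dimensions below are made up for illustration (the actual WSI size is not stated here):

```python
def tile_count(height, width, tile_size, stride):
    """Sliding-window tile count: floor((dim - tile_size) / stride) + 1
    tiles along each axis, multiplied together."""
    ny = (height - tile_size) // stride + 1
    nx = (width - tile_size) // stride + 1
    return ny * nx

# hypothetical 1024 x 1536 region with the parameters above
n = tile_count(1024, 1536, tile_size=512, stride=256)  # 3 * 5 = 15
```

With stride half the tile size, adjacent tiles overlap by 50%, which is what multiplies the number of training patches.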

image

result images of the updated run

For the next step, in my opinion, the first task is designing a pipeline to apply segmentation to tissue ROIs. Once the algorithm is reliable enough for that objective, the next task is segmentation by the annotation terms provided in the 'annotation.csv' file, segmenting each selected tissue into the following 5 classes: 'Dermal component of melanoma', 'Intra-epidermal component of melanoma', 'Normal dermis', 'Normal sub-cutaneous tissue', 'Normal epidermis (with papillary dermis)'.
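Moving from binary to 5-class segmentation mostly means merging per-term masks into one label mask. A sketch, where the integer ids and the overlap rule (later terms overwrite earlier ones) are assumptions, not something fixed by the repository:

```python
import numpy as np

# hypothetical mapping from the five annotation terms in
# 'annotation.csv' to integer class ids (0 reserved for background)
CLASS_IDS = {
    "Dermal component of melanoma": 1,
    "Intra-epidermal component of melanoma": 2,
    "Normal dermis": 3,
    "Normal sub-cutaneous tissue": 4,
    "Normal epidermis (with papillary dermis)": 5,
}

def terms_to_class_mask(term_masks: dict) -> np.ndarray:
    """Merge per-term binary masks into one multi-class label mask;
    later terms overwrite earlier ones where annotations overlap."""
    shapes = {m.shape for m in term_masks.values()}
    assert len(shapes) == 1, "all masks must share one shape"
    out = np.zeros(next(iter(shapes)), dtype=np.uint8)
    for term, binary in term_masks.items():
        out[binary.astype(bool)] = CLASS_IDS[term]
    return out

demo = terms_to_class_mask({
    "Normal dermis": np.array([[1, 0], [0, 0]]),
    "Normal epidermis (with papillary dermis)": np.array([[0, 1], [0, 0]]),
})
```

On the model side this would pair with `classes=5` (or 6 with background) in the SMP model and a multi-class loss instead of the binary one.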

Regards, M Najafi

About

WSI Segmentation project
