
Segmentation-WSI-andU-Net-Manipulation

Objectives

Hi 😀

This repository provides the scripts for part 1 and part 2, together with the tiling and augmentation preprocessing.

The project is described as Segmentation: WSI and U-Net Pipeline Manipulation.

In summary, the mask of the WSI image is provided in 'annotation.csv' in a specific format. The steps are as follows:

  • Extracting the ROI information from the CSV file
  • Preprocessing the data to produce the desired mask in a downsampled version (using pyvips), reaching a state in which you can select the tissues whose specific masks you want
  • Tile preprocessing to provide image patches for training and validation, and augmentation of the data (using pyvips)
  • Running training and validation, inspecting some random predictions, and applying error analysis to improve the results
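The downsample step in the list above can be sketched in plain NumPy (the actual pipeline uses pyvips); `downsample` is an illustrative helper, not a function from this repository:

```python
import numpy as np

def downsample(mask: np.ndarray, factor: int) -> np.ndarray:
    """Downsample a 2-D mask by block averaging, mimicking the
    downsampled mask produced with pyvips (factor must divide
    both dimensions of the mask)."""
    h, w = mask.shape
    assert h % factor == 0 and w % factor == 0
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# toy 4x4 mask downsampled by 2 -> 2x2
small = downsample(np.eye(4), 2)
```

A real WSI mask would be downsampled by a much larger factor (e.g. 64, as in the smaller mask image shown below).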

Having gained a better understanding of its functionality, I believe it would be worthwhile to focus on a specific class of annotation terms, or on a selected tissue, for further refinement. Additionally, to achieve better results, I have prepared a pilot setup for fine-tuning SMP models from pretrained weights. This can be applied to an augmented dataset generated from tiles of the original image, especially when ample image data is available and can be processed on a powerful server.

The first part of the project prepares the data and its label mask, which is derived from the geometry data in the 'annotations.csv' file and the main image 'm9de8lfp.tif'. I did this by extracting the geometry data and rasterizing it over the original WSI image using the 'pyvips' and 'shapely' libraries. The mask is obtained as follows:
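The core of this step is turning each annotation polygon into a filled binary mask. A minimal sketch of that rasterization in plain NumPy (the pipeline itself uses shapely and pyvips; `rasterize_polygon` is a hypothetical helper using the standard even-odd ray-casting test):

```python
import numpy as np

def rasterize_polygon(vertices, height, width):
    """Fill a polygon (list of (x, y) vertices) into a binary mask
    using even-odd ray casting; a stand-in for the shapely-based
    rasterization of the CSV geometry data."""
    mask = np.zeros((height, width), dtype=np.uint8)
    xs, ys = zip(*vertices)
    n = len(vertices)
    for y in range(height):
        for x in range(width):
            inside = False
            j = n - 1
            for i in range(n):
                # does the horizontal ray from (x, y) cross edge (j, i)?
                if (ys[i] > y) != (ys[j] > y) and \
                   x < (xs[j] - xs[i]) * (y - ys[i]) / (ys[j] - ys[i]) + xs[i]:
                    inside = not inside
                j = i
            mask[y, x] = inside
    return mask

# a small square annotation rasterized onto an 8x8 canvas
square = [(2, 2), (6, 2), (6, 6), (2, 6)]
mask = rasterize_polygon(square, 8, 8)
```

In practice shapely's `Polygon` handles the geometry and the result is written at WSI scale with pyvips; this loop is only for illustrating the idea at toy size.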

extracted mask

For a better view of the mask extracted from the 'annotation.csv' file, see the next image, which is a smaller version:

m9de8lfp_MASK_new_64

The next step is preparing the mask data for image segmentation. To use the mask as ground truth, we need a binary mask. For segmentation there are two possible approaches: the first relates to the annotation terms, for which we can use the original mask; the second is tissue segmentation, which is covered in part 2. Below you can see the binary masks, the registered masks, and each tissue mask separately, prepared for further processing if needed.
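Separating a labeled mask into one binary mask per tissue, as described above, can be sketched as follows (`split_tissue_masks` is an illustrative helper, not code from the repository):

```python
import numpy as np

def split_tissue_masks(label_mask: np.ndarray) -> dict:
    """Turn an integer label mask (0 = background) into one binary
    mask per tissue label, one entry per labeled region class."""
    return {int(lbl): (label_mask == lbl).astype(np.uint8)
            for lbl in np.unique(label_mask) if lbl != 0}

labels = np.array([[0, 1, 1],
                   [2, 2, 0],
                   [0, 0, 2]])
masks = split_tissue_masks(labels)
```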

filled morphed mask filled morphed registered

labeling different region each tissue registered

For part 3, tiling is applied to produce patches for training and validation. You can find some random tiles with their masks in the following plot: random tiles
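The tiling step amounts to a sliding window that cuts aligned image and mask patches. A minimal NumPy sketch (the pipeline does this with pyvips at full WSI resolution; `tile_pairs` and the sizes below are illustrative):

```python
import numpy as np

def tile_pairs(image, mask, tile_size, stride):
    """Cut aligned (image, mask) patches with a sliding window,
    analogous to the pyvips-based tiling used for training data."""
    tiles = []
    h, w = image.shape[:2]
    for y in range(0, h - tile_size + 1, stride):
        for x in range(0, w - tile_size + 1, stride):
            tiles.append((image[y:y + tile_size, x:x + tile_size],
                          mask[y:y + tile_size, x:x + tile_size]))
    return tiles

img = np.zeros((1024, 1024, 3), dtype=np.uint8)
msk = np.zeros((1024, 1024), dtype=np.uint8)
patches = tile_pairs(img, msk, tile_size=512, stride=256)  # 3 x 3 = 9 tiles
```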

The last step is segmentation using SMP with PyTorch. As my computational resources and data are limited (epochs = 15; various augmentation methods are scripted as comments and tested to be applicable if needed), I have only designed a prototype version; you can see the results of applying a pretrained model with 'imagenet' weights in the following images. The validation loss plot is also shown. (I tried the SMP model without pretrained weights, and the results were not good enough at low epoch counts.)

val_loss of pretrained model

pretrained model result5 pretrained model result2 pretrained model result4

pretrained model result3

Lastly, please consider this a prototype; it can be adapted to the objective, the data, and the available computational resources. You can also download the prototype version of the fine-tuned pretrained model (15 epochs) from the following link: https://drive.google.com/file/d/1mlq5dmFtXn0TiE-1S-_iE6aD9xQ8O8fX/view?usp=sharing

Updates and comments: I augmented the data in three ways. Applying more augmentation methods, with strides that produce more images, would cost more time and resources, but could further improve the binary segmentation of tissue ROIs; more images are provided below. However, since the data diversity is not high, we may see some defects in the cost function on the validation set. This can be addressed by using more diverse WSIs for training and validation, and by generating more augmented data, which requires more computational resources for trial and error on training and validation in order to optimize (considering dropout if needed).
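A three-way augmentation like the one described can be sketched with simple flips in NumPy (the repository scripts the actual augmentations with albumentations; `augment_three_ways` and the flip choice are illustrative assumptions):

```python
import numpy as np

def augment_three_ways(image, mask):
    """Return the original plus two flipped copies of an (image, mask)
    pair, flipping image and mask together so they stay aligned."""
    pairs = [(image, mask)]
    pairs.append((np.fliplr(image).copy(), np.fliplr(mask).copy()))
    pairs.append((np.flipud(image).copy(), np.flipud(mask).copy()))
    return pairs

img = np.arange(16).reshape(4, 4)
msk = (img > 7).astype(np.uint8)
augmented = augment_three_ways(img, msk)  # dataset tripled
```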

The updated version, fine-tuned for 30 epochs, has the following cost function. It is unstable in some epochs, which may stem from the lack of diverse data in both training and testing, and the model may have somewhat overfitted the training set. image

Some result images of this updated algorithm on test data: image image image

And there is another update (the peaks in the cost function might be resolved by more diversified data for training and validation): tile_size = 512, stride = 256; 171 images for training, 45 for validation; total tiles: 216; augmentations = [HorizontalFlip(p=1), ElasticTransform(p=1)]
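For reference, the number of tiles a sliding window with these parameters yields follows the usual formula; the image dimensions below are made up for illustration (the actual WSI size is not stated here):

```python
def tile_count(height, width, tile_size, stride):
    """Sliding-window tile count: floor((dim - tile_size) / stride) + 1
    tiles along each axis, multiplied together."""
    ny = (height - tile_size) // stride + 1
    nx = (width - tile_size) // stride + 1
    return ny * nx

# hypothetical 1024 x 1536 region with the parameters above
n = tile_count(1024, 1536, tile_size=512, stride=256)  # 3 * 5 = 15
```

With stride half the tile size, adjacent tiles overlap by 50%, which is what multiplies the number of training patches.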

image

result images of the updated run

For the next step, in my opinion, the first task is designing a pipeline to apply segmentation to tissue ROIs. Once the algorithm is reliable enough for that objective, the next task is segmentation by the annotation terms provided in the 'annotation.csv' file, segmenting each selected tissue into the following 5 classes: 'Dermal component of melanoma', 'Intra-epidermal component of melanoma', 'Normal dermis', 'Normal sub-cutaneous tissue', 'Normal epidermis (with papillary dermis)'.
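Moving from binary to 5-class segmentation mostly means merging per-term masks into one label mask. A sketch, where the integer ids and the overlap rule (later terms overwrite earlier ones) are assumptions, not something fixed by the repository:

```python
import numpy as np

# hypothetical mapping from the five annotation terms in
# 'annotation.csv' to integer class ids (0 reserved for background)
CLASS_IDS = {
    "Dermal component of melanoma": 1,
    "Intra-epidermal component of melanoma": 2,
    "Normal dermis": 3,
    "Normal sub-cutaneous tissue": 4,
    "Normal epidermis (with papillary dermis)": 5,
}

def terms_to_class_mask(term_masks: dict) -> np.ndarray:
    """Merge per-term binary masks into one multi-class label mask;
    later terms overwrite earlier ones where annotations overlap."""
    shapes = {m.shape for m in term_masks.values()}
    assert len(shapes) == 1, "all masks must share one shape"
    out = np.zeros(next(iter(shapes)), dtype=np.uint8)
    for term, binary in term_masks.items():
        out[binary.astype(bool)] = CLASS_IDS[term]
    return out

demo = terms_to_class_mask({
    "Normal dermis": np.array([[1, 0], [0, 0]]),
    "Normal epidermis (with papillary dermis)": np.array([[0, 1], [0, 0]]),
})
```

On the model side this would pair with `classes=5` (or 6 with background) in the SMP model and a multi-class loss instead of the binary one.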

Regards, M Najafi

About

WSI Segmentation project
