Lung segmentation for chest X-ray images with ResUNet and UNet. In addition, feature extraction and tuberculosis diagnosis have been developed.
Computer Vision (CV) has a lot of applications in medical diagnosis:
- Dermatology
- Ophthalmology
- Histopathology
X-ray images are critical for the detection of lung cancer, pneumonia ... In this notebook you will learn:
- Data pre-processing and augmentation
- Preprocess images properly for the train, validation and test sets.
- Set up neural networks to segment the images and make disease predictions on chest X-rays.
The dataset contains X-rays and corresponding masks. Some masks are missing, so it is advised to cross-reference the images and masks. The original poster (OP) of the dataset made the following request: It is requested that publications resulting from the use of this data attribute the source (National Library of Medicine, National Institutes of Health, Bethesda, MD, USA and Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China) and cite the following publications:
[1] Jaeger S, Karargyris A, Candemir S, Folio L, Siegelman J, Callaghan F, Xue Z, Palaniappan K, Singh RK, Antani S, Thoma G, Wang YX, Lu PX, McDonald CJ. Automatic tuberculosis screening using chest radiographs. IEEE Trans Med Imaging. 2014 Feb;33(2):233-45. doi: 10.1109/TMI.2013.2284099. PMID: 24108713
[2] Candemir S, Jaeger S, Palaniappan K, Musco JP, Singh RK, Xue Z, Karargyris A, Antani S, Thoma G, McDonald CJ. Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration. IEEE Trans Med Imaging. 2014 Feb;33(2):577-90. doi: 10.1109/TMI.2013.2290491. PMID: 24239990
X-ray images in this data set have been acquired from the tuberculosis control program of the Department of Health and Human Services of Montgomery County, MD, USA. This set contains 138 posterior-anterior x-rays, of which 80 x-rays are normal and 58 x-rays are abnormal with manifestations of tuberculosis. All images are de-identified and available in DICOM format. The set covers a wide range of abnormalities, including effusions and miliary patterns. The data set includes radiology readings available as a text file.
- The dataset is made up of images and segmentation masks from two different sources.
- The naming convention of the masks is slightly inconsistent.
- Some images don't have corresponding masks.
- Images from the Shenzhen dataset appear to have smaller lungs than those from the Montgomery dataset.
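Because some masks are missing and the mask names are not uniform, it helps to pair files explicitly before training. The sketch below is one way to do that, not the notebook's own code. It assumes that some masks carry a trailing `_mask` before the extension while others reuse the image name unchanged; the helper names `mask_key` and `pair_images_with_masks` and the file names in the usage example are hypothetical.

```python
from pathlib import Path


def mask_key(mask_name: str) -> str:
    """Return the image stem that a mask file is assumed to belong to.

    Assumption: a mask is either named exactly like its image or has
    a trailing "_mask" before the extension.
    """
    stem = Path(mask_name).stem
    return stem[: -len("_mask")] if stem.endswith("_mask") else stem


def pair_images_with_masks(image_names, mask_names):
    """Split images into (image, mask) pairs and images without a mask."""
    masks_by_key = {mask_key(name): name for name in mask_names}
    pairs, missing = [], []
    for image in sorted(image_names):
        mask = masks_by_key.get(Path(image).stem)
        if mask is None:
            missing.append(image)
        else:
            pairs.append((image, mask))
    return pairs, missing


# Hypothetical file names, for illustration only.
images = ["CHNCXR_0001_0.png", "MCUCXR_0001_0.png", "CHNCXR_0002_0.png"]
masks = ["CHNCXR_0001_0_mask.png", "MCUCXR_0001_0.png"]
pairs, missing = pair_images_with_masks(images, masks)
```

Only the entries in `pairs` would be used for segmentation training; the entries in `missing` are images that have no mask and can be set aside.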
Data augmentation comprises techniques that increase the amount of data by adding slightly modified copies of existing data or synthetic data newly created from existing data. It acts as a regularizer and helps reduce overfitting when training a machine learning model. It is closely related to oversampling in data analysis.
- Create contrast images (v1)
- Create contrast images (v2)
- Create noise images
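The text does not say how the two contrast variants or the noise images are produced, so the following is only a minimal NumPy sketch of three common intensity augmentations. Treating v1 as linear contrast scaling and v2 as a gamma curve is an assumption, as are the function names, the parameter values, and the random stand-in image.

```python
import numpy as np


def to_uint8(values):
    """Round, clip to [0, 255] and convert to 8-bit pixels."""
    return np.clip(np.rint(values), 0, 255).astype(np.uint8)


def adjust_contrast(image, factor):
    """Scale pixel deviations from the image mean by `factor`."""
    image = image.astype(np.float32)
    mean = image.mean()
    return to_uint8((image - mean) * factor + mean)


def adjust_gamma(image, gamma):
    """Apply a gamma curve: gamma < 1 brightens, gamma > 1 darkens."""
    normalized = image.astype(np.float32) / 255.0
    return to_uint8(255.0 * normalized ** gamma)


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return to_uint8(image.astype(np.float32) + noise)


rng = np.random.default_rng(seed=0)
# Random pixels stand in for a real grayscale X-ray here.
xray = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
augmented = [
    adjust_contrast(xray, factor=1.5),
    adjust_gamma(xray, gamma=0.7),
    add_gaussian_noise(xray, sigma=10.0, rng=rng),
]
```

All three transforms change pixel intensities only and leave the geometry untouched, so the original lung mask can be reused for each augmented copy.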
U-Net is a convolutional neural network that was developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg. The network is based on the fully convolutional network and its architecture was modified and extended to work with fewer training images and to yield more precise segmentations.
ResUNet is a semantic segmentation model inspired by deep residual learning and UNet. The architecture takes advantage of both residual and UNet models. Paper: https://arxiv.org/pdf/1711.10684.pdf
Video explanation: https://youtu.be/BOoBWRTpaKk
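For readers new to residual learning, the core idea can be written in one line. The notation below follows the deep residual learning literature and is an assumption about how the residual units behave, not a description of this notebook's exact layers:

$$
x_{l+1} = x_l + \mathcal{F}(x_l, W_l)
$$

Here $x_l$ is the input to the $l$-th residual unit and $\mathcal{F}$ is a small stack of convolutional layers with weights $W_l$. The identity shortcut $x_l$ eases gradient flow in deep networks, while the UNet part contributes the skip connections between encoder and decoder.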
| Model | Number of parameters | Validation loss |
|---|---|---|
| U-Net | 7,759,521 | 0.03169 |
| Residual-Unet | 4,722,737 | 0.02861 |
- Exploratory data analysis
- Feature extraction
- Feed the extracted features into an MLP neural network for classification
Use pretrained InceptionResNetV2 for feature extraction.
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Tuberculosis negative cases | 0.80 | 0.88 | 0.84 | 77 |
| Tuberculosis positive cases | 0.88 | 0.80 | 0.84 | 83 |
| accuracy | 0.84 | 0.84 | 0.84 | 160 |
Use pretrained ResNet50 for feature extraction.
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Tuberculosis negative cases | 0.95 | 0.85 | 0.89 | 85 |
| Tuberculosis positive cases | 0.85 | 0.95 | 0.89 | 75 |
| accuracy | 0.89 | 0.89 | 0.89 | 160 |
Use pretrained VGG16 for feature extraction.
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| Tuberculosis negative cases | 0.90 | 0.90 | 0.90 | 84 |
| Tuberculosis positive cases | 0.89 | 0.89 | 0.89 | 76 |
| accuracy | 0.90 | 0.90 | 0.90 | 160 |
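To read the reports above, it may help to see how precision, recall, F1-score and accuracy follow from the four confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the function name `binary_report` and the counts in the usage example are made up for illustration and are not taken from the notebook's results.

```python
def binary_report(tp, fp, fn, tn):
    """Metrics for the positive class plus overall accuracy.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative
    and true-negative counts of a binary classifier.
    """
    precision = tp / (tp + fp)  # how many predicted positives are correct
    recall = tp / (tp + fn)     # how many actual positives are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "support": tp + fn,  # number of actual positive cases
    }


# Illustrative counts only, not results from this notebook.
report = binary_report(tp=9, fp=1, fn=3, tn=7)
```

Swapping the roles of the two classes (`tp` with `tn`, `fp` with `fn`) gives the row for the negative class, which is why each table lists a separate precision and recall per class.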