Let's look into some more advanced concepts.
Convolutional Neural Network (CNN)-based models can be made more transparent by visualizing the regions of the input that are "important" for their predictions, that is, by producing visual explanations. Gradient-weighted Class Activation Mapping (Grad-CAM) uses the gradients of any target concept (say, the logit for 'dog' or even a caption) flowing into the final convolutional layer to produce a coarse localization map that highlights the regions of the image that matter for predicting that concept.
We take the final convolutional feature map and weight every channel in it by the gradient of the class score with respect to that channel, averaged over spatial positions. The result combines how strongly the input image activates each channel with how important that channel is for the class. Grad-CAM does not require any retraining or change to the existing architecture.
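The channel weighting described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in this repository: it assumes the activations and gradients of the last convolutional layer have already been captured for one image (in PyTorch this is typically done with hooks), and the function name `grad_cam` and the random example shapes are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM map for a single image.

    activations: array of shape (K, H, W), output of the final conv layer.
    gradients:   array of shape (K, H, W), gradient of the target class
                 score with respect to those activations.
    Returns an (H, W) map scaled to [0, 1].
    """
    # Channel weights: average each channel's gradient over spatial positions.
    weights = gradients.mean(axis=(1, 2))                  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep only
    # evidence that increases the class score.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Scale for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Illustrative shapes only (assumed): 512 channels on a 4x4 grid.
rng = np.random.default_rng(0)
acts = rng.random((512, 4, 4))
grads = rng.standard_normal((512, 4, 4))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

The resulting map is coarse (one value per feature-map cell), so it is usually upsampled to the input resolution before being overlaid on the image.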
- Train for 40 Epochs
- Display 20 misclassified images
- Display Grad-CAM output for the SAME 20 misclassified images
- Apply the following transforms while training:
  - RandomCrop(32, padding=4)
  - CutOut(16x16)
  - Rotate(±5°)
- Must use ReduceLROnPlateau
- Must use Layer Normalization ONLY
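To make the crop and cutout transforms above concrete, here is a small NumPy sketch of what they do to a single image. It is an illustration under assumptions, not the Albumentations pipeline used in this repository: the helper names `random_crop_with_padding` and `cutout` are hypothetical, padding is filled with zeros, the hole is kept fully inside the image, and the rotation step is omitted.

```python
import numpy as np

def random_crop_with_padding(img, size=32, padding=4, rng=None):
    """Zero-pad an (H, W, C) image on every side, then take a random size x size crop."""
    if rng is None:
        rng = np.random.default_rng()
    padded = np.pad(img, ((padding, padding), (padding, padding), (0, 0)))
    top = rng.integers(0, padded.shape[0] - size + 1)
    left = rng.integers(0, padded.shape[1] - size + 1)
    return padded[top:top + size, left:left + size]

def cutout(img, hole=16, fill=0, rng=None):
    """Return a copy of img with one hole x hole square overwritten by `fill`."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    top = rng.integers(0, h - hole + 1)
    left = rng.integers(0, w - hole + 1)
    out = img.copy()
    out[top:top + hole, left:left + hole] = fill
    return out

# Example on a dummy 32x32 RGB image.
rng = np.random.default_rng(0)
image = np.ones((32, 32, 3), dtype=np.float32)
augmented = cutout(random_crop_with_padding(image, rng=rng), rng=rng)
print(augmented.shape)
```

Both helpers preserve the 32x32 input size, so they can be applied to training images without changing the model's expected input shape.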
- Model: ResNet18
- Total Train data: 60,000 | Total Test Data: 10,000
- Total Parameters: 11,173,962
- Test Accuracy: 90.03%
- Epochs: 40
- Normalization: Layer Normalization
- Regularization: L2 with factor 0.0001
- Optimizer: Adam with learning rate 0.001
- Loss criterion: Cross Entropy
- Scheduler: ReduceLROnPlateau
- Albumentations:
  - RandomCrop(32, padding=4)
  - CutOut(16x16)
  - Rotate(±5°)
  - CoarseDropout
  - Normalization
- Misclassified Images: 1104 images were misclassified out of 10,000
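The reduce-on-plateau idea behind the scheduler listed above can be sketched in plain Python. This is a simplified stand-in, not PyTorch's `ReduceLROnPlateau` itself (it leaves out thresholds, cooldown, and a minimum learning rate). The class name `PlateauLR`, the `factor` and `patience` values, and the loss sequence are assumptions chosen for illustration; only the starting learning rate of 0.001 comes from the settings above.

```python
class PlateauLR:
    """Simplified reduce-on-plateau rule for a metric that should decrease."""

    def __init__(self, lr=0.001, factor=0.1, patience=2):
        self.lr = lr                  # current learning rate
        self.factor = factor          # multiplier applied on a plateau (assumed value)
        self.patience = patience      # non-improving epochs tolerated (assumed value)
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's metric and return the (possibly reduced) learning rate."""
        if metric < self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Reduce only after more than `patience` epochs without improvement.
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr

# Made-up validation losses that stall at 0.8 for several epochs.
sched = PlateauLR(lr=0.001, factor=0.1, patience=2)
losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.7]
history = [sched.step(loss) for loss in losses]
print(history)
```

In this toy run the learning rate stays at its initial value until the loss has failed to improve for three consecutive epochs, and is then multiplied by `factor`.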
- resnet.py: Defines the ResNet-18 architecture with Layer Normalization.
  Reference: https://github.com/kuangliu/pytorch-cifar/blob/master/models/resnet.py
- utils: The utility code contains the following components:
  - Data Loaders
  - Albumentations
  - Accuracy Plots
  - Misclassification Image Plots
  - Seed
- main.py: The main code contains the following functions:
  - Train code
  - Test code
  - Main function for training and testing the model
- Colab file: The Google Colab file contains the following steps:
  - Cloning the Git repository
  - Loading data by calling the data loader function from the utils file
  - Model summary
  - Running the model by calling the main file
  - Plotting accuracy plots
  - Plotting 20 misclassified images
  - Plotting the Grad-CAM output for the same 20 misclassified images
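Because the misclassified-image grid and the Grad-CAM grid must show the same 20 images, one simple approach is to compute the indices of the misclassified samples once and reuse them for both plots. The sketch below is a hypothetical helper (`first_misclassified` is not a function from this repository) and the predictions and labels are made-up values.

```python
def first_misclassified(predictions, targets, limit=20):
    """Return the indices of the first `limit` samples whose prediction differs from the label."""
    wrong = [i for i, (p, t) in enumerate(zip(predictions, targets)) if p != t]
    return wrong[:limit]

# Made-up predicted and true class ids for eight test images.
preds = [3, 5, 1, 0, 7, 7, 2, 9]
labels = [3, 4, 1, 8, 7, 6, 2, 9]
indices = first_misclassified(preds, labels, limit=20)
print(indices)
```

The same `indices` list can then be passed to both plotting steps, which guarantees that each Grad-CAM heatmap lines up with the misclassified image shown in the first grid.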
Abhiram Gurijala
Arijit Ganguly
Rohin Sequeira