'CNN_Sorghum_Weed_Classifier' is an artificial intelligence (AI)-based software tool that differentiates sorghum sampling images from images of the associated weeds. This repository provides the source code for pre-processing, augmenting, and normalizing the 'SorghumWeedDataset_Classification' dataset, as well as the code for training, validating, and testing the AI model using transfer learning. A reproducible version of CNN_Sorghum_Weed_Classifier is also available at https://codeocean.com/capsule/1446799/tree
CNN_Sorghum_Weed_Classifier is built on 'SorghumWeedDataset_Classification', a crop-weed research dataset. The source code clones the dataset for further processing and model building. The following references relate to the dataset:
- First appeared at https://data.mendeley.com/datasets/4gkcyxjyss/1
- GitHub repository: https://github.com/JustinaMichael/SorghumWeedDataset_Classification.git
- Detailed description of the data acquisition process: https://www.sciencedirect.com/science/article/pii/S2352340923009678
Language: Python 3.10.12
Dependencies:
- TensorFlow: 2.14.0
- Scikit-learn: 1.2.2
- Seaborn: 0.12.2
- Matplotlib: 3.7.1
- SciPy: 1.11.3
- NumPy: 1.23.5
- Pandas: 1.5.3
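A quick way to confirm that an environment matches the versions listed above is to compare them against what is installed. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib import metadata

# Versions copied from the dependency list above (PyPI distribution names assumed).
PINNED = {
    "tensorflow": "2.14.0",
    "scikit-learn": "1.2.2",
    "seaborn": "0.12.2",
    "matplotlib": "3.7.1",
    "scipy": "1.11.3",
    "numpy": "1.23.5",
    "pandas": "1.5.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

A newer version is not necessarily a problem; the check only reports where the environment differs from the one listed here.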
The complete source code for pre-processing the dataset and creating the model is included in the interactive Python notebook "CNN_Sorghum_Weed_Classifier.ipynb."
'CNN_Sorghum_Weed_Classifier.ipynb' can be opened in 'Google Colaboratory' (or any other Jupyter Notebook environment). The notebook runtime is set to 'T4 GPU' to speed up model training.
The dataset is cloned from the respective GitHub repository using the following command:
!git clone https://github.com/JustinaMichael/SorghumWeedDataset_Classification.git
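After cloning, it can be useful to check how many images ended up in each folder. The following is a minimal sketch that assumes the images are JPEG or PNG files stored in subfolders of the cloned directory; the folder layout, the extension list, and the helper name `count_images_by_folder` are assumptions and are not taken from the dataset repository.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; adjust if the dataset uses other formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_by_folder(root):
    """Count image files under root, grouped by the name of their parent folder."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            counts[path.parent.name] += 1
    return dict(counts)


if __name__ == "__main__":
    # Directory name created by the git clone command above.
    print(count_images_by_folder("SorghumWeedDataset_Classification"))
```

If the same folder name appears under several splits, their counts are added together, so pass a single split directory as `root` to inspect it on its own.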
The necessary libraries and packages are installed, and the tuned hyper-parameter values are then initialized. The data is augmented and normalized before the model is built. The following code snippet augments and normalizes the training data:
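# ImageDataGenerator is part of the Keras API bundled with TensorFlow
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and apply random augmentations to the training images
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   rotation_range = 45,
                                   width_shift_range = 0.3,
                                   shear_range = 0.25,
                                   zoom_range = 0.25,
                                   height_shift_range = 0.3,
                                   horizontal_flip = True,
                                   brightness_range = (0.2, 0.9),
                                   vertical_flip = True,
                                   fill_mode = 'reflect')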
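To make a few of these settings concrete, the sketch below reproduces the effect of `rescale`, `horizontal_flip`, and `vertical_flip` on a tiny grayscale image stored as nested lists. It is an illustration only and not how Keras implements these steps: Keras applies the flips at random per image, and the helper names here are hypothetical.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by factor, mapping 0..255 values into 0..1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom by reversing the row order."""
    return [list(row) for row in image[::-1]]


if __name__ == "__main__":
    tiny = [[0, 128, 255],
            [64, 32, 16]]
    print(rescale(tiny))
    print(horizontal_flip(tiny))
    print(vertical_flip(tiny))
```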
Using transfer learning, the classifier is trained, validated, and tested on the following four pre-trained Convolutional Neural Network (CNN) models. The code for each model is provided in sequence.
- VGG19
- MobileNetV2
- DenseNet201
- ResNet152V2
The following code snippet trains and validates each model:
history = model.fit(x = training_set,
                    batch_size = batch_size,
                    epochs = epochs,
                    callbacks = cd,
                    validation_data = valid_set,
                    steps_per_epoch = len(training_set),
                    validation_steps = len(valid_set),
                    validation_batch_size = batch_size,
                    validation_freq = 1)
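The `steps_per_epoch = len(training_set)` and `validation_steps = len(valid_set)` arguments rely on the length of a Keras data iterator being the number of batches needed to pass over the data once. A minimal sketch of that arithmetic follows; the helper name `steps_for` is hypothetical, and the sample counts in the usage lines are placeholders rather than the dataset's real sizes.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to cover num_samples exactly once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    # Placeholder numbers, chosen only to show the rounding behaviour.
    print(steps_for(100, 32))  # the last batch is smaller than the others
    print(steps_for(96, 32))   # divides evenly
```

Because the last partial batch still counts as a step, every image is seen once per epoch even when the sample count is not a multiple of the batch size.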
The following code configures the 'EarlyStopping' callback, which helps prevent overfitting by halting training once the validation accuracy stops improving, even though the model is set up for 50 training epochs:
es = EarlyStopping(monitor = "val_accuracy",
min_delta = 0.01,
patience = 5,
verbose = 1,
mode = 'auto')
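The stopping rule configured above can be sketched in plain Python to show when training would halt. This is a simplified model of the callback's behaviour, not the Keras source: it assumes higher validation accuracy is better, and the function name and the accuracy values in the usage lines are hypothetical.

```python
def early_stopping_epoch(val_accuracies, min_delta=0.01, patience=5):
    """Return the 1-based epoch at which training would stop, or None.

    An epoch counts as an improvement only if it beats the best value
    seen so far by more than min_delta. Training stops after `patience`
    consecutive epochs without such an improvement.
    """
    best = float("-inf")
    wait = 0
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        if accuracy - min_delta > best:
            best = accuracy
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


if __name__ == "__main__":
    # Made-up validation accuracies: gains after epoch 3 are below min_delta.
    history = [0.5, 0.6, 0.7, 0.705, 0.706, 0.707, 0.708, 0.709, 0.95]
    print(early_stopping_epoch(history))
```

With `min_delta = 0.01`, small gains in validation accuracy do not reset the patience counter, so training can stop even while the metric is still creeping upward.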
The following code evaluates each of the four models on the test set so that their results can be compared. DenseNet201 produced the best results of the four, with the highest accuracy (0.96) and a loss of 0.4. The results are presented graphically for easy comprehension.
evaluate_test_data = model.evaluate(test_set)
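Once each model has been evaluated, the comparison can be done programmatically. The sketch below assumes that `model.evaluate` returns a loss followed by an accuracy (which depends on the metrics the model was compiled with) and that the results were collected into a dictionary keyed by model name; the helper name `pick_best` and that dictionary layout are hypothetical.

```python
def pick_best(results):
    """Return the name of the model with the highest accuracy.

    results maps a model name to a (loss, accuracy) pair.
    Ties on accuracy are broken in favour of the lower loss.
    """
    if not results:
        raise ValueError("results is empty")
    return max(results, key=lambda name: (results[name][1], -results[name][0]))


# Example of how the dictionary could be filled in the notebook (assumed layout):
#   results["DenseNet201"] = tuple(model.evaluate(test_set))
```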
This project is licensed under the Apache License, Version 2.0.
If you find the "SorghumWeedDataset_Classification" dataset useful and use it in your work, please credit it by citing:
Justina, Michael J., and M. Thenmozhi. "SorghumWeedDataset_Classification And SorghumWeedDataset_Segmentation Datasets For Classification, Detection, and Segmentation In Deep Learning." Data in Brief (2023): 109935.
- Justina Michael. J
Google Scholar: https://scholar.google.com/citations?user=pEEzO14AAAAJ&hl=en&oi=ao
ORCID: https://orcid.org/0000-0001-8072-3230
- Dr. M. Thenmozhi
Google Scholar: https://scholar.google.com/citations?user=Es49w08AAAAJ&hl=en&oi=ao
ORCID: https://orcid.org/0000-0002-8064-5938