In recent years, artificial intelligence has transformed the agriculture industry by improving practices, cutting costs, and minimizing risks. The goal of this project is to build a neural network that predicts, from soil and environmental characteristics, which crop is best suited for cultivation. Two architectures are evaluated: one with two hidden layers and another with five. A thorough hyperparameter tuning procedure is used to determine the best settings for each, and the performance of both designs is then compared to assess whether they are fit for use in a crop recommendation system.
Since the introduction of precision agriculture, crop recommendation systems have proven extremely valuable. Precision agriculture is a farming management approach that improves the sustainability of agricultural production by monitoring, measuring, and adapting to temporal and spatial variability. Accurate crop recommendations based on environmental, soil, and climatic characteristics are a key component of this optimization, whereas traditional methods are labor-intensive and often error-prone.
This project aims to develop a robust neural network model capable of accurately predicting the best crops to cultivate in a given area. By harnessing the power of deep learning, we seek to provide farmers with valuable insights that can enhance their decision-making process and ultimately maximize crop yields.
Data Acquisition:
- Acquire the dataset from Kaggle, a popular platform for hosting datasets, ensuring its relevance to the crop recommendation task.
Data Preprocessing:
- Identify and handle missing values to maintain data integrity.
- Detect and address outliers using the interquartile range (IQR) method so they do not skew the analysis (a minimal sketch follows below).
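A minimal sketch of the IQR rule, assuming the features are handled as a pandas DataFrame; the column names in the commented usage are assumptions about the Kaggle dataset, and clipping is just one possible treatment of the flagged values:

```python
import pandas as pd

def clip_outliers_iqr(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Clip values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] for the given numeric columns."""
    df = df.copy()
    for col in columns:
        q1, q3 = df[col].quantile(0.25), df[col].quantile(0.75)
        iqr = q3 - q1
        df[col] = df[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df

# Hypothetical usage; the feature column names are assumptions about the Kaggle dataset.
# df = pd.read_csv("Crop_recommendation.csv")
# df = clip_outliers_iqr(df, ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"])
```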
Statistical Analysis:
- Visualize the distribution of data through histograms and boxplots to gain insights into its characteristics.
- Calculate summary statistics to understand the central tendency and dispersion of the data.
Data Normalization and Splitting:
- Normalize the input features to ensure uniformity and facilitate model convergence.
- Partition the dataset using the holdout method, allocating 70% for training and 30% for testing, while ensuring a balanced representation of each class in both sets (see the sketch below).
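A hedged sketch of this step with scikit-learn; the exact scaler used in the notebooks is not stated here, so MinMaxScaler and the "label" target column name are assumptions:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("Crop_recommendation.csv")
X = df.drop(columns=["label"]).values    # "label" as the target column is an assumption
y = df["label"].values

# Stratified 70/30 holdout split keeps every crop class balanced across both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

scaler = MinMaxScaler()                  # the specific normalization scheme is an assumption
X_train = scaler.fit_transform(X_train)  # fit on training data only to avoid leakage
X_test = scaler.transform(X_test)
```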
Neural Network Model Architecture Definition:
- Design two architectures: one with 2 hidden layers and another with 5 hidden layers, incorporating batch normalization and dropout to improve generalization (an illustrative sketch follows below).
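For illustration, the 2-hidden-layer variant could look like the PyTorch sketch below. The 7 input features, the 22-class output (taken from the classification reports later in this README), the layer widths, and the exact placement of batch normalization and dropout are assumptions, not the precise architecture in code.ipynb:

```python
import torch.nn as nn

class CropNet2Hidden(nn.Module):
    """Illustrative 2-hidden-layer crop classifier with batch normalization and dropout."""
    def __init__(self, n_features=7, n_classes=22, n_hidden=40,
                 dropout=0.2, activation=nn.ReLU):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.BatchNorm1d(n_hidden),
            activation(),
            nn.Dropout(dropout),
            nn.Linear(n_hidden, n_hidden),
            nn.BatchNorm1d(n_hidden),
            activation(),
            nn.Dropout(dropout),
            nn.Linear(n_hidden, n_classes),  # raw logits; CrossEntropyLoss applies log-softmax
        )

    def forward(self, x):
        return self.net(x)
```

The 5-hidden-layer variant would simply repeat the hidden block (linear layer, batch norm, activation, dropout) three more times.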
Hyperparameter Tuning:
- Specify the search space for hyperparameters, including parameters such as learning rate, batch size, and regularization strength.
- Employ k-fold cross-validation (k = 5) to systematically explore hyperparameter combinations and identify the optimal configuration based on mean accuracy (a sketch of the search loop follows below).
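One way this search could be driven is sketched below; `train_and_evaluate` is a hypothetical user-supplied callable that builds, trains, and scores one model per fold, and the original notebooks may organize the loop differently:

```python
from itertools import product
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(params, X, y, train_and_evaluate, k=5):
    """Mean validation accuracy of one hyperparameter configuration over k stratified folds."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=42)
    scores = [
        train_and_evaluate(params, X[tr], y[tr], X[va], y[va])
        for tr, va in skf.split(X, y)
    ]
    return float(np.mean(scores))

def grid_search(search_space, X, y, train_and_evaluate):
    """Return the configuration with the highest mean cross-validation accuracy."""
    configs = [dict(zip(search_space, values)) for values in product(*search_space.values())]
    return max(configs, key=lambda cfg: cross_validate(cfg, X, y, train_and_evaluate))
```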
Model Selection:
- Evaluate and compare the performance of the two architectures based on the highest accuracy achieved during the hyperparameter tuning phase, selecting the most effective model for each architecture.
Metrics Evaluation:
- Assess the chosen models' performance on the test set by making predictions and generating a comprehensive classification report (accuracy, precision, recall, and F1-score), giving a holistic view of their effectiveness in crop recommendation; a hedged evaluation sketch follows below.
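A short sketch of this evaluation step with scikit-learn, assuming a trained PyTorch model and integer-encoded labels:

```python
import torch
from sklearn.metrics import classification_report

def evaluate(model, X_test, y_test, class_names):
    """Print per-class precision, recall, F1-score and overall accuracy for a trained model."""
    model.eval()
    with torch.no_grad():
        logits = model(torch.tensor(X_test, dtype=torch.float32))
        y_pred = logits.argmax(dim=1).numpy()
    print(classification_report(y_test, y_pred, target_names=class_names))

# Hypothetical usage, assuming the crop labels were integer-encoded with a LabelEncoder `le`:
# evaluate(model, X_test, y_test_encoded, class_names=le.classes_)
```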
In this README, we present a detailed overview of the methodology used to develop and train our crop recommendation neural network, along with insights into hyperparameter tuning and the results achieved. Additionally, we provide instructions on how to use our model and invite contributions from the community to further enhance its capabilities.
To optimize the performance of our neural network models, we conducted an extensive hyperparameter tuning process. The table below outlines the search space for each parameter:
Parameter | Options |
---|---|
Batch Size | 10, 50, 100 |
Max Epochs | 10, 50, 100 |
Optimizer | Adam, Adadelta, Adagrad, Adamax |
Learning Rate | 0.0001, 0.001, 0.01 |
Activation Function | ReLU, LeakyReLU, RReLU |
Number of Neurons in Hidden Layers | 10, 40, 80 |
Criterion | CrossEntropyLoss |
Dropout Rate | 0.0, 0.2, 0.5 |
Weight Initialization | Xavier Uniform Initialization, Xavier Normal Initialization |
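For reference, the table above corresponds to the Cartesian product sketched below (with our own key names; the criterion is fixed, and the notebooks may sample or prune this grid rather than enumerate it exhaustively):

```python
from itertools import product

search_space = {
    "batch_size": [10, 50, 100],
    "max_epochs": [10, 50, 100],
    "optimizer": ["Adam", "Adadelta", "Adagrad", "Adamax"],
    "lr": [0.0001, 0.001, 0.01],
    "activation": ["ReLU", "LeakyReLU", "RReLU"],
    "hidden_units": [10, 40, 80],
    "dropout": [0.0, 0.2, 0.5],
    "weight_init": ["xavier_uniform", "xavier_normal"],
}

configs = [dict(zip(search_space, values)) for values in product(*search_space.values())]
print(len(configs))  # 3 * 3 * 4 * 3 * 3 * 3 * 3 * 2 = 5832 candidate configurations
```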
After thorough experimentation, the following insights were gathered:
- Larger learning rates tend to yield better average scores for both architectures across all optimizers.
- For the architecture with 2 hidden layers, increasing the batch size results in a lower average score, whereas the architecture with 5 hidden layers achieves its peak average score at a batch size of 50.
- The combination of Adam optimizer and RReLU activation function demonstrates superior performance on average for both architectures.
- Overall, the Adam optimizer yields slightly better average scores for both architectures, although performance is broadly similar across all optimizers.
- While all activation functions perform relatively similarly for the architecture with 2 hidden layers, RReLU stands out for better average performance in the architecture with 5 hidden layers.
- Both architectures benefit from increased epochs, with optimizers and activation functions exhibiting better performance as the number of epochs increases.
- It's worth noting that the full experimentation process took approximately 10 hours for the architecture with 2 hidden layers and around 17 hours for the architecture with 5 hidden layers.
These findings provide valuable insights into the impact of hyperparameters on the performance and training time of our neural network models, guiding further optimization efforts.
Hyperparameter Tuning Results - 2 Hidden Layer Architecture
Parameter | Value |
---|---|
Batch Size | 100 |
Max Epochs | 100 |
Optimizer | Adamax |
Learning Rate | 0.01 |
Activation Function | LeakyReLU |
Number of Neurons | 40 |
Criterion | CrossEntropyLoss |
Dropout Rate | 0.2 |
Weight Initialization | Xavier Normal Initialization |
Hyperparameter Tuning Results - 5 Hidden Layer Architecture
Parameter | Value |
---|---|
Batch Size | 100 |
Max Epochs | 100 |
Optimizer | Adam |
Learning Rate | 0.01 |
Activation Function | LeakyReLU |
Number of Neurons | 40 |
Criterion | CrossEntropyLoss |
Dropout Rate | 0.0 |
Weight Initialization | Xavier Normal Initialization |
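To make the tables concrete, here is a hedged sketch of how the winning 2-hidden-layer configuration could map onto PyTorch objects, reusing the illustrative CropNet2Hidden class from earlier (the exact class names and training loop in model_train_save.ipynb may differ):

```python
import torch.nn as nn
import torch.optim as optim

def xavier_normal_init(module):
    """Apply Xavier normal initialization to every linear layer."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_normal_(module.weight)
        nn.init.zeros_(module.bias)

# Tuned 2-hidden-layer configuration: 40 neurons, LeakyReLU, dropout 0.2,
# Xavier normal initialization, Adamax with learning rate 0.01, CrossEntropyLoss.
# (Training would then run for 100 epochs with a batch size of 100, per the table.)
model = CropNet2Hidden(n_hidden=40, dropout=0.2, activation=nn.LeakyReLU)
model.apply(xavier_normal_init)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adamax(model.parameters(), lr=0.01)
```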
Hyperparameter Tuning - Best Accuracy per Architecture
Metric | 2 Hidden Layer Architecture (%) | 5 Hidden Layer Architecture (%) |
---|---|---|
Accuracy | 98.7 | 98.4 |
Classification Report - 2 Hidden Layer Architecture
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
apple | 1.00 | 1.00 | 1.00 | 30 |
banana | 1.00 | 1.00 | 1.00 | 30 |
black gram | 0.97 | 1.00 | 0.98 | 30 |
chickpea | 1.00 | 1.00 | 1.00 | 30 |
coconut | 1.00 | 1.00 | 1.00 | 30 |
coffee | 1.00 | 1.00 | 1.00 | 30 |
cotton | 0.97 | 0.97 | 0.97 | 30 |
grapes | 1.00 | 1.00 | 1.00 | 30 |
jute | 0.86 | 1.00 | 0.92 | 30 |
kidney beans | 0.97 | 1.00 | 0.98 | 30 |
lentil | 1.00 | 1.00 | 1.00 | 30 |
maize | 0.97 | 0.93 | 0.95 | 30 |
mango | 0.94 | 1.00 | 0.97 | 30 |
moth beans | 1.00 | 0.93 | 0.97 | 30 |
mung bean | 1.00 | 1.00 | 1.00 | 30 |
muskmelon | 1.00 | 1.00 | 1.00 | 30 |
orange | 1.00 | 1.00 | 1.00 | 30 |
papaya | 1.00 | 1.00 | 1.00 | 30 |
pigeon peas | 1.00 | 0.97 | 0.98 | 30 |
pomegranate | 1.00 | 1.00 | 1.00 | 30 |
rice | 1.00 | 0.83 | 0.91 | 30 |
watermelon | 1.00 | 1.00 | 1.00 | 30 |
Accuracy | | | 0.98 | 660 |
Macro Avg | 0.98 | 0.98 | 0.98 | 660 |
Weighted Avg | 0.98 | 0.98 | 0.98 | 660 |
Classification Report - 5 Hidden Layer Architecture
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
apple | 1.00 | 1.00 | 1.00 | 30 |
banana | 1.00 | 1.00 | 1.00 | 30 |
black gram | 1.00 | 1.00 | 1.00 | 30 |
chickpea | 1.00 | 1.00 | 1.00 | 30 |
coconut | 0.97 | 1.00 | 0.98 | 30 |
coffee | 1.00 | 1.00 | 1.00 | 30 |
cotton | 0.97 | 0.97 | 0.97 | 30 |
grapes | 1.00 | 1.00 | 1.00 | 30 |
jute | 0.85 | 0.93 | 0.89 | 30 |
kidney beans | 1.00 | 1.00 | 1.00 | 30 |
lentil | 1.00 | 0.97 | 0.98 | 30 |
maize | 0.97 | 0.97 | 0.97 | 30 |
mango | 1.00 | 1.00 | 1.00 | 30 |
moth beans | 0.97 | 1.00 | 0.98 | 30 |
mung bean | 1.00 | 1.00 | 1.00 | 30 |
muskmelon | 1.00 | 1.00 | 1.00 | 30 |
orange | 1.00 | 0.97 | 0.98 | 30 |
papaya | 1.00 | 1.00 | 1.00 | 30 |
pigeon peas | 1.00 | 1.00 | 1.00 | 30 |
pomegranate | 1.00 | 1.00 | 1.00 | 30 |
rice | 0.93 | 0.83 | 0.88 | 30 |
watermelon | 1.00 | 1.00 | 1.00 | 30 |
Accuracy | | | 0.98 | 660 |
Macro Avg | 0.98 | 0.98 | 0.98 | 660 |
Weighted Avg | 0.98 | 0.98 | 0.98 | 660 |
Overall, the tuned hyperparameters for both architectures perform well and are consistent with the general findings of the experimentation process. Both models show consistently high performance across most classes, each reaching an accuracy of 0.98 on the test set. However, subtle variations in the per-class metrics (most notably for jute and rice) highlight differences in each model's ability to classify specific crops. While both models classify accurately, the most suitable choice may depend on the specific priorities of the application; either would be a viable basis for an AI-driven crop recommendation system.
- `code.ipynb`: Implementation of the methodology described in this README.
- `model_train_save.ipynb`: Code for training and saving the models with the optimized parameters obtained during hyperparameter tuning.
- `experiments/`: Plots generated from the results of the hyperparameter tuning process. Referenced files: `cv_results_2_hidden_layers.xlsx` and `cv_results_5_hidden_layers.xlsx`.
- `statistics/`: Plots describing various statistical aspects of the dataset.
- `Crop_recommendation.csv`: The dataset used for training and testing the crop recommendation models.
- `saved_models/model2hidden.pth` and `saved_models/model5hidden.pth`: The saved models resulting from the training process, ready to be deployed and used.
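Assuming the `.pth` files store state_dicts (they may instead hold full pickled models), a hypothetical inference call could look like this, again reusing the illustrative CropNet2Hidden class sketched above:

```python
import torch
import torch.nn as nn

model = CropNet2Hidden(n_hidden=40, dropout=0.2, activation=nn.LeakyReLU)
model.load_state_dict(torch.load("saved_models/model2hidden.pth", map_location="cpu"))
model.eval()

# Placeholder for one normalized feature row (7 features assumed); in practice this
# should be scaled with the same scaler that was fitted on the training data.
sample = torch.rand(1, 7)
with torch.no_grad():
    predicted_class_index = model(sample).argmax(dim=1).item()
    print("Predicted class index:", predicted_class_index)
```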