This project performs exploratory data analysis (EDA) and classifies date fruits using features extracted from their images. The dataset contains 34 features, which are used to train several machine learning models: logistic regression, random forest, k-nearest neighbors, XGBoost, and a simple neural network. Each model is trained and tested on the dataset and evaluated on accuracy, precision, recall, and F1-score. The best model is then selected based on these metrics.
The dataset used in this project is the Date Fruit Dataset from Kaggle. It contains 34 features in total, including morphological features, shape and color features, and texture features.
- Logistic Regression
- Random Forest
- K-Neighbors
- XGBoost
- Simple Neural Network (MLP)
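
A minimal sketch of how models like the ones listed above could be trained and compared with scikit-learn. It is not the project's actual code: the data here is a synthetic stand-in generated with `make_classification` (34 features to mirror the dataset, but an arbitrary number of classes and samples), the hyperparameters are illustrative assumptions, and macro averaging is assumed for precision, recall, and F1. XGBoost is left out to keep the example dependent on scikit-learn only; `xgboost.XGBClassifier` could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 34 numeric features,
# with an arbitrary choice of 4 classes and 600 samples.
X, y = make_classification(
    n_samples=600, n_features=34, n_informative=12, n_classes=4, random_state=0
)

# Hold out 20% of the samples for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with StandardScaler.
# All hyperparameters below are illustrative, not the project's settings.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "K-Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Simple Neural Network (MLP)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Macro averaging weights every class equally (an assumption here).
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Pick the model with the highest macro F1-score on the held-out split.
best = max(results, key=lambda n: results[n]["f1"])
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
print("Best by F1:", best)
```

Because the data is synthetic, the scores printed by this sketch say nothing about how the models rank on the Date Fruit Dataset.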
Among all the models, Logistic Regression performed best, with a precision of 0.94, a recall of 0.95, and an F1-score of 0.94.
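
For reference, the sketch below shows one way multi-class precision, recall, and F1-score can be computed from predictions. It assumes macro averaging (the unweighted mean of per-class scores); the averaging used for the figures above is not stated, so this may differ from how they were produced. The function name `macro_scores` and the labels `"A"`, `"B"`, `"C"` are hypothetical placeholders, not classes from the dataset.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, f1) for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class received a wrong sample
            fn[truth] += 1  # true class missed one of its samples

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with hypothetical class labels.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Weighted averaging, which scales each class by its number of true samples, is a common alternative when classes are imbalanced.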