- Description: This project uses machine learning to predict sleep efficiency, a key measure of sleep quality, from a dataset covering sleep-related parameters, demographics, and lifestyle choices. The goal is to build a model that predicts sleep efficiency accurately, offer personalized insights into sleep patterns, and contribute to the field of sleep science.
- Explored and cleaned a diverse dataset containing sleep patterns, lifestyle choices, and demographic information.
- Applied one-hot encoding to non-numeric columns like gender and smoking status.
- Extracted the hour component from date-time columns to simplify the analysis.
- Identified and handled outliers using the Interquartile Range (IQR) method.
- Visualized outliers through box plots for various features.
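
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on toy data, not the notebook's actual code: the column names (`Gender`, `Smoking status`, `Bedtime`, `Awakenings`) and the helper `iqr_bounds` are hypothetical, and dropping rows outside the IQR fences is only one possible way of "handling" outliers.

```python
import pandas as pd

# Toy data; column names are hypothetical stand-ins for the real dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Smoking status": ["Yes", "No", "No", "No", "Yes", "No"],
    "Bedtime": pd.to_datetime([
        "2021-03-06 01:00:00", "2021-12-05 02:00:00", "2021-05-25 21:30:00",
        "2021-11-03 02:30:00", "2021-03-13 01:00:00", "2021-07-01 21:00:00",
    ]),
    "Awakenings": [0.0, 3.0, 1.0, 1.0, 2.0, 40.0],
})

# Keep only the hour component of the date-time column.
df["Bedtime"] = df["Bedtime"].dt.hour

# One-hot encode the non-numeric columns.
df = pd.get_dummies(df, columns=["Gender", "Smoking status"])


def iqr_bounds(series, k=1.5):
    """Return the lower and upper IQR fences for a numeric Series."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = iqr_bounds(df["Awakenings"])

# Assumption: outliers are handled by dropping rows outside the fences.
cleaned = df[df["Awakenings"].between(low, high)]
```

A box plot of a column (for example with `seaborn.boxplot`) shows the same fences visually, which is one way to check which points the IQR rule flags.
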
- Utilized Random Forest, LightGBM, AdaBoosted LightGBM, and Linear Regression for model comparison.
- Selected Random Forest as the best-performing model based on Mean Squared Error.
- Fine-tuned Random Forest hyperparameters using GridSearchCV.
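
A rough sketch of the model comparison and tuning described above, using synthetic data. The parameter grid, the train/test split, and the data itself are assumptions made for illustration; the notebook's actual grid is not stated here. The LightGBM variants are left out to keep the example dependent on scikit-learn only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the prepared features and the sleep-efficiency target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))
y = 0.5 + 0.3 * X[:, 0] - 0.2 * X[:, 1] ** 2 + rng.normal(scale=0.02, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Compare candidate models by Mean Squared Error on the held-out split.
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_squared_error(y_test, model.predict(X_test))

# Tune the Random Forest with a small, hypothetical grid.
param_grid = {"n_estimators": [25, 50], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # GridSearchCV maximizes, so MSE is negated
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_
tuned_mse = mean_squared_error(y_test, search.predict(X_test))
```

Other regressors with a scikit-learn style `fit`/`predict` interface can be added to the `models` dictionary and compared in the same loop.
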
- Employed Min-Max scaling to normalize the dataset.
- Evaluated model performance using Mean Squared Error.
- Validated predictions through violin plots comparing actual vs. predicted values.
- Visualized feature importance using bar plots.
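
The scaling and evaluation steps might look something like the sketch below. The feature names and synthetic data are hypothetical, and fitting the scaler on the training split only is a common practice assumed here rather than something the notebook is confirmed to do.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic data; the feature names are hypothetical.
rng = np.random.default_rng(1)
features = ["Age", "Sleep duration", "Awakenings"]
X = pd.DataFrame(rng.uniform(0, 10, size=(150, 3)), columns=features)
y = 0.9 - 0.04 * X["Awakenings"] + rng.normal(scale=0.01, size=150)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Min-Max scaling: fit on the training split, then reuse it on the test split.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = RandomForestRegressor(n_estimators=50, random_state=1)
model.fit(X_train_scaled, y_train)

# Evaluate with Mean Squared Error on the held-out data.
mse = mean_squared_error(y_test, model.predict(X_test_scaled))

# Rank features by the forest's impurity-based importance.
importances = pd.Series(
    model.feature_importances_, index=features
).sort_values(ascending=False)
```

Calling `importances.plot.bar()` would give a bar plot of the ranking, and a violin plot of `y_test` next to the predictions is one way to compare the two distributions.
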
- Execute the cells of the Forecasting_Sleep_Efficiency_Random_Forest.pynb file sequentially, from top to bottom.
- Python 3.x
- Jupyter Notebooks
- Libraries: pandas, matplotlib, seaborn, scikit-learn, lightgbm
Hoyath Ali
- Dataset source: Kaggle
Feel free to reach out for any questions or feedback!