This Python project analyzes crowd density data in Bhopal, India, and builds machine learning models to predict crowd density from factors such as weather, time of day, and day of the week. It includes data preprocessing, visualization, model implementation, and evaluation.
The project is implemented as a Jupyter notebook and uses numpy, matplotlib, pandas, seaborn, and scikit-learn.
The code begins with importing necessary libraries and loading the dataset from a CSV file (Bhopal_Crowd_Data.csv). It then proceeds to analyze the dataset by visualizing the distribution of crowd density by zone, weather, hour slot, and day of the week using box plots and bar plots.
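The loading and plotting step might look roughly like the sketch below. It is not the notebook's code: the column names (Zone, Weather, Hour_Slot, Day, Crowd_Density) are assumptions, a small synthetic frame stands in for Bhopal_Crowd_Data.csv so the example runs on its own, and pandas' matplotlib wrappers are used in place of seaborn to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset; every column name here is an assumption.
# With the real file this would be: df = pd.read_csv("Bhopal_Crowd_Data.csv")
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Zone": rng.choice(["Zone A", "Zone B", "Zone C"], size=n),
    "Weather": rng.choice(["Sunny", "Rainy", "Cloudy"], size=n),
    "Hour_Slot": rng.choice(["Morning", "Afternoon", "Evening"], size=n),
    "Day": rng.choice(["Mon", "Sat", "Sun"], size=n),
    "Crowd_Density": rng.uniform(0, 100, size=n),
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Box plot: spread of crowd density within each zone.
df.boxplot(column="Crowd_Density", by="Zone", ax=axes[0])

# Bar plot: mean crowd density per weather condition.
mean_by_weather = df.groupby("Weather")["Crowd_Density"].mean()
mean_by_weather.plot.bar(ax=axes[1], title="Mean crowd density by weather")

fig.tight_layout()
```

The same two calls can be repeated with Hour_Slot or Day as the grouping column to cover the other views mentioned above.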
Next, it preprocesses the data by encoding categorical variables using label encoding and one-hot encoding techniques. The dataset is split into training and testing sets for model training and evaluation.
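A minimal sketch of this preprocessing, under stated assumptions: the column names are hypothetical, the data is synthetic, and the choice of which column gets label encoding versus one-hot encoding is a guess, as is the 80/20 split. The notebook may differ on all of these.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for Bhopal_Crowd_Data.csv (column names are assumptions).
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "Zone": rng.choice(["Zone A", "Zone B", "Zone C"], size=n),
    "Weather": rng.choice(["Sunny", "Rainy", "Cloudy"], size=n),
    "Hour_Slot": rng.choice(["Morning", "Afternoon", "Evening"], size=n),
    "Day": rng.choice(["Mon", "Sat", "Sun"], size=n),
    "Crowd_Density": rng.uniform(0, 100, size=n),
})

# Label encoding: replace each zone name with an integer code.
df["Zone"] = LabelEncoder().fit_transform(df["Zone"])

# One-hot encoding: expand the remaining categoricals into indicator columns.
# drop_first=True drops one level per column to avoid redundant indicators.
X = pd.get_dummies(
    df.drop(columns="Crowd_Density"),
    columns=["Weather", "Hour_Slot", "Day"],
    drop_first=True,
)
y = df["Crowd_Density"]

# Hold out 20% of the rows for evaluation (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

Label encoding imposes an arbitrary order on the categories, which tree-based models tolerate but linear models and SVMs can misread, so one-hot encoding is usually the safer default for unordered categories.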
Several machine learning models are then implemented and evaluated for crowd density prediction, including Support Vector Machine (SVM), Decision Tree, Multiple Linear Regression, and Random Forest.
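The comparison could be sketched as follows. This assumes crowd density is a continuous target, so the regression variants of each model are used (SVR for the SVM); the features are synthetic and the hyperparameters are scikit-learn defaults rather than the notebook's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic numeric features and target standing in for the encoded dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = 50 + 30 * X[:, 0] - 20 * X[:, 1] + rng.normal(0, 2, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "SVM": SVR(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Multiple Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it with R-squared on the test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
print("Best model:", best_name)
```

Scoring every model on the same held-out split keeps the R-squared values directly comparable.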
The performance of each model is assessed using the R-squared metric, and the best-performing model is used to predict crowd density. Finally, hyperparameter tuning using Grid Search is demonstrated to find the optimal parameters for the Random Forest model.
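Grid search over a Random Forest can be sketched as below. The parameter grid, the 3-fold cross-validation, and the synthetic data are all illustrative assumptions; the notebook's actual grid is not stated here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic numeric features and target standing in for the encoded dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 4))
y = 50 + 30 * X[:, 0] - 20 * X[:, 1] + rng.normal(0, 2, size=200)

# Hypothetical grid: every combination is fitted and scored with cross-validation.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_grid=param_grid,
    scoring="r2",  # same metric used to compare the models
    cv=3,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated R^2: {search.best_score_:.3f}")
```

After fitting, `search.best_estimator_` holds a Random Forest refitted on all the supplied data with the winning parameters, so it can be used directly for prediction.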
Mean accuracy: 88.89%
Clone the repository or download the files. Run the Caps_Data.ipynb notebook in a Jupyter environment.
Python 3.x. Libraries: numpy, matplotlib, pandas, seaborn, scikit-learn.
1. Harshit Tiwari 2. Shashwat Saxena 3. Akshay Mathur 4. Tapesh
MIT License
Ensure Bhopal_Crowd_Data.csv is in the same directory as the notebook before running the code.