This project uses a linear regression model built in Python with pandas, matplotlib, seaborn, and scikit-learn. The CRISP-DM lifecycle methodology guides the project, covering every phase from data understanding to deployment.
The dataset covers Bangalore house prices. During the data understanding phase, exploratory data analysis (EDA) and cleaning were performed: inspecting the data, reviewing summary statistics and data types, and handling missing values. Feature engineering was then applied to improve the model's predictive power.
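The EDA and cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the tiny DataFrame, the column names (location, size, total_sqft, price), the choice to drop incomplete rows, and the two engineered features are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Bangalore data; the real column names may differ.
df = pd.DataFrame({
    "location": ["Whitefield", "Indiranagar", None, "Whitefield"],
    "size": ["2 BHK", "3 BHK", "2 BHK", None],
    "total_sqft": [1056.0, 1600.0, np.nan, 1200.0],
    "price": [39.07, 120.0, 62.0, 51.0],
})

# Data understanding: first rows, summary statistics, data types.
print(df.head())
print(df.describe())
print(df.dtypes)

# Count missing values per column.
missing = df.isna().sum()
print(missing)

# Cleaning (assumed strategy): drop rows with any missing value.
clean = df.dropna().copy()

# Feature engineering (assumed features): bedroom count parsed from "size",
# and price per square foot.
clean["bhk"] = clean["size"].str.split().str[0].astype(int)
clean["price_per_sqft"] = clean["price"] / clean["total_sqft"]
```

Dropping rows is only one way to handle missing values; imputing a median or a placeholder category may suit the real dataset better.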
In the model development phase, a linear regression model achieved a score of approximately 90%. For a regression model this figure is most likely the R² score rather than classification accuracy. K-fold cross-validation and grid search (scikit-learn's GridSearchCV) were used to evaluate the model and identify the best hyperparameters.
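As a rough sketch of how k-fold cross-validation and grid search fit together in scikit-learn, consider the example below. The synthetic data, the 5-fold split, the R² scoring, and the parameter grid are assumptions chosen for illustration; the project's actual features and search space are not stated here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic stand-in data (assumption): two numeric features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(200, 2))
y = 0.05 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 5, size=200)

# K-fold cross-validation: one R^2 score per fold.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
print("Mean R^2 across folds:", scores.mean())

# Grid search over the few options LinearRegression exposes (assumed grid).
param_grid = {"fit_intercept": [True, False], "positive": [True, False]}
search = GridSearchCV(LinearRegression(), param_grid, cv=cv, scoring="r2")
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best cross-validated R^2:", search.best_score_)
```

Plain linear regression has very few tunable settings, so a grid search is more informative when it also compares regularized variants such as Ridge or Lasso.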
The model and its web application have been deployed on Streamlit. Deployment involved preparing the cleaned data, deploying the model, and building a Streamlit web application that lets users interact with it.
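One possible shape for the deployment step is sketched below: serialize the trained model, reload it, and wrap prediction in a small Streamlit form. The toy training data, the two input features, and the helper names predict_price and run_app are hypothetical. Using pickle is also an assumption, since the project does not say how the model is stored.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data (assumption): [total_sqft, bedrooms] -> price.
X = np.array([[1000.0, 2], [1500.0, 3], [2000.0, 3], [2500.0, 4]])
y = np.array([50.0, 80.0, 100.0, 130.0])
model = LinearRegression().fit(X, y)

# Serialize and restore the model; a real app would write to and read from a file.
blob = pickle.dumps(model)
loaded = pickle.loads(blob)


def predict_price(model, total_sqft, bhk):
    """Hypothetical helper: predict a price for a single listing."""
    features = np.array([[total_sqft, bhk]])
    return float(model.predict(features)[0])


def run_app(model):
    """Hypothetical Streamlit front end for the model."""
    import streamlit as st  # imported here so the code above runs without Streamlit

    st.title("Bangalore House Price Prediction")
    sqft = st.number_input("Total area (sq ft)", min_value=100.0, value=1000.0)
    bhk = st.number_input("Bedrooms", min_value=1, value=2, step=1)
    if st.button("Predict"):
        st.write(predict_price(model, sqft, bhk))
```

To serve the form, a script would call run_app(loaded) at module level and be launched with the streamlit run command.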
Feel free to contribute or provide feedback. Happy predicting!