This is an end-to-end machine learning project that predicts housing prices in Boston. It covers the full workflow, from data acquisition and preprocessing, through model training with Linear Regression, to deployment. The project demonstrates exploratory data analysis, feature standardization, and model serialization. The trained model is deployed with Docker containers on Heroku, making the project a practical demonstration of applying machine learning to real estate pricing.
- Data Handling: Acquisition and preprocessing of the Boston housing dataset.
- Exploratory Data Analysis: Statistical analysis and visualization to understand underlying patterns.
- Feature Engineering: Standardization and normalization to improve model performance.
- Model Training: Implementation of Linear Regression and performance evaluation using metrics like RMSE.
- Deployment: Serving the trained model with Docker and Heroku for real-time predictions.
- CI/CD Pipeline: Automated updates and deployment through GitHub Actions.
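The workflow listed above (standardization, Linear Regression, RMSE evaluation, serialization) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it uses plain NumPy least squares in place of whatever library the project uses, runs on synthetic stand-in data instead of the Boston dataset, and all function names are hypothetical.

```python
import pickle

import numpy as np


def standardize(X, mean=None, std=None):
    """Z-score each column; reuse training statistics when they are passed in."""
    mean = X.mean(axis=0) if mean is None else mean
    std = X.std(axis=0) if std is None else std
    return (X - mean) / std, mean, std


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept column, solved via lstsq."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the feature weights


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (NOT the Boston dataset): 200 rows, 13 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))
true_w = rng.normal(size=13)
y = 22.0 + X @ true_w + rng.normal(scale=0.1, size=200)

# Simple hold-out split.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit the scaler on training data only, then apply it to the test data.
X_train_s, mean, std = standardize(X_train)
X_test_s, _, _ = standardize(X_test, mean, std)

coef = fit_linear_regression(X_train_s, y_train)
test_rmse = rmse(y_test, predict(coef, X_test_s))

# Serialize the fitted parameters together with the scaling statistics,
# so that inference can apply the same preprocessing.
blob = pickle.dumps({"coef": coef, "mean": mean, "std": std})
restored = pickle.loads(blob)
```

Storing the scaling statistics alongside the coefficients matters because the deployed model has to standardize incoming requests with the same mean and standard deviation used during training.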
- CRIM - Crime Rate Per Capita
- ZN - Proportion of Residential Land Zoned for Lots over 25,000 sq. ft.
- INDUS - Proportion of Non-Retail Business Acres
- CHAS - Charles River Adjacency (Dummy Variable)
- NOX - Nitric Oxides Concentration (parts per 10 million)
- RM - Average Rooms Per Dwelling
- AGE - Proportion of Pre-1940 Built Homes
- DIS - Weighted Distance to Employment Centers
- RAD - Accessibility to Radial Highways Index
- TAX - Property Tax Rate Per $10,000
- PTRATIO - Pupil-Teacher Ratio by Town
- B - Proportion of Black Residents (Transformed)
- LSTAT - Percentage of Lower Status Population
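A model trained on these columns expects its inputs in a fixed order. The sketch below shows one way to turn a record keyed by feature name into an ordered row. It assumes the column order matches the list above; the helper name `to_feature_row` is hypothetical and not part of the project.

```python
# Feature names in the order listed above (assumed to match the training columns).
FEATURES = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT",
]


def to_feature_row(record):
    """Validate a dict of feature values and return them as an ordered list of floats."""
    missing = [name for name in FEATURES if name not in record]
    extra = [name for name in record if name not in FEATURES]
    if missing or extra:
        raise ValueError(f"missing features: {missing}; unexpected features: {extra}")
    return [float(record[name]) for name in FEATURES]


# Example record with illustrative values.
sample = {
    "CRIM": 0.00632, "ZN": 18.2, "INDUS": 2.31, "CHAS": 0, "NOX": 0.538,
    "RM": 6.575, "AGE": 25.0, "DIS": 4.09, "RAD": 1, "TAX": 296,
    "PTRATIO": 15.3, "B": 396.6, "LSTAT": 4.98,
}
row = to_feature_row(sample)
```

Rejecting missing or unexpected keys up front keeps a malformed request from silently shifting values into the wrong columns.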
- [GitHub Account](https://github.com/)
- [VS Code IDE](https://code.visualstudio.com/)
- [Heroku Account](https://dashboard.heroku.com/)
- [Git CLI](https://git-scm.com/downloads)
conda create -p venv python==3.11.7 -y
Activate the virtual environment from the command prompt using:
conda activate venv/
Docker packages the application together with its base configuration, which avoids most environment-related problems such as configuration mismatches and OS differences between machines.
Run the following command in the command prompt:
python app.py
Check the model's prediction by sending the JSON below in a POST request (for example, with Postman) to the API at http://127.0.0.1:5000:
{
  "data": {
    "CRIM": 0.00632,
    "ZN": 18.2,
    "INDUS": 2.31,
    "CHAS": 0,
    "NOX": 0.538,
    "RM": 6.575,
    "AGE": 25.0,
    "DIS": 4.09,
    "RAD": 1,
    "TAX": 296,
    "PTRATIO": 15.3,
    "B": 396.6,
    "LSTAT": 4.98
  }
}
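As an alternative to Postman, the same request can be sent from Python using only the standard library. This is a sketch under assumptions: the route `/predict_api` is hypothetical (this README does not name the endpoint path, so substitute whatever route `app.py` defines), and the response is assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"
PREDICT_PATH = "/predict_api"  # hypothetical route; use the one defined in app.py

payload = {
    "data": {
        "CRIM": 0.00632, "ZN": 18.2, "INDUS": 2.31, "CHAS": 0, "NOX": 0.538,
        "RM": 6.575, "AGE": 25.0, "DIS": 4.09, "RAD": 1, "TAX": 296,
        "PTRATIO": 15.3, "B": 396.6, "LSTAT": 4.98,
    }
}


def build_request(payload, url=BASE_URL + PREDICT_PATH):
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload):
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the server from `python app.py` running, call:
#   print(send(payload))
```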
Contributions to this project are welcome. Please fork the repository and submit a pull request.