house-price-prediction

This project develops a machine learning model that predicts the median housing price of a California district, which helps decide whether investing in that area is worthwhile.

Project Organization

├── data
│   └── housing.csv             <- Data from Kaggle
├── images                      <- Images for visualization
├── models                      <- Trained models
├── notebooks
│   ├── preparation_notebooks   <- Only the notebooks needed to produce the model: data preparation, pipeline creation, parameter tuning, etc.
│   ├── testing_notebooks       <- Scratch notebooks for quick tests and experiments
│   └── main.ipynb              <- Main notebook
├── requirements.txt            <- The requirements file, generated with `pip freeze > requirements.txt`
└── README.md                   <- Project explanation, notes, etc.

Project Objectives

Analyze and preprocess the dataset.
Train different regression models and compare their performance.
Select the 3 to 5 best-performing models and tune their hyperparameters.
Select the 2 or 3 best-performing tuned models, combine them into an ensemble, and compare the ensemble against the individual models.
Pick the best model among these candidates.
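The compare-then-ensemble workflow above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it assumes scikit-learn is installed, uses synthetic data in place of `data/housing.csv`, and the three candidate models, the `cv_rmse` helper, and the choice of `VotingRegressor` for the ensemble are all assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption); the project itself uses data/housing.csv.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)

# A small, hypothetical set of candidates; the project compares 14 models.
candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}


def cv_rmse(model, X, y, folds=5):
    """Return the mean cross-validated RMSE of a model (lower is better)."""
    scores = cross_val_score(
        model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
    )
    return float(-scores.mean())


# Step 1: score every candidate with the same cross-validation scheme.
results = {name: cv_rmse(model, X, y) for name, model in candidates.items()}
ranking = sorted(results, key=results.get)

# Step 2: ensemble the two best candidates by averaging their predictions.
top_two = [(name, candidates[name]) for name in ranking[:2]]
ensemble = VotingRegressor(estimators=top_two)
results["ensemble"] = cv_rmse(ensemble, X, y)

# Step 3: compare the ensemble against the individual models.
for name in sorted(results, key=results.get):
    print(f"{name:>18}: RMSE = {results[name]:.2f}")
```

Scoring every candidate and the ensemble with the same folds and the same metric keeps the comparison fair; the ensemble is only worth keeping if it beats its best member.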

Design and Implementation Details

Supervised Learning: The model is trained on labeled examples.
Regression Task: The model predicts a continuous value, the median house price.
Data Preprocessing: The dataset is prepared by handling missing data, processing outliers, and applying feature engineering, transformation, and extraction steps.
Model Selection: 14 different regression models are trained and their performance is compared.
Hyperparameter Tuning: The hyperparameters of the 5 best-performing models are tuned.
Final Model: After ensembling GradientBoostingRegressor and LGBMRegressor and comparing the results, LGBMRegressor was chosen as the final model.
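As an illustration of the preprocessing step described above, the sketch below chains imputation, a feature-engineering step, a transformation, and scaling in a scikit-learn `Pipeline`. It is not the project's actual pipeline: the sample rows, the column order, and the `add_ratio_features` helper are hypothetical, and the specific choices (median imputation, ratio features, `log1p`) are assumptions made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical rows with assumed columns:
# total_rooms, total_bedrooms, households, median_income
X = np.array([
    [880.0, 129.0, 126.0, 8.3],
    [7099.0, np.nan, 1138.0, 8.3],   # missing total_bedrooms
    [1467.0, 190.0, 177.0, 7.3],
    [1274.0, 235.0, 219.0, 5.6],
])


def add_ratio_features(X):
    """Append rooms-per-household and bedrooms-per-room columns."""
    rooms, bedrooms, households = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([X, rooms / households, bedrooms / rooms])


preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),         # handle missing data
    ("ratios", FunctionTransformer(add_ratio_features)),  # feature engineering
    ("log", FunctionTransformer(np.log1p)),               # tame skewed values
    ("scale", StandardScaler()),                          # zero mean, unit variance
])

X_prepared = preprocess.fit_transform(X)
print(X_prepared.shape)
```

Wrapping the steps in one pipeline means the statistics learned on the training data (medians, means, variances) are reused unchanged on the validation and test sets.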

Results and Improvement Recommendations

RMSE values for the LGBMRegressor model on the training, validation, and test sets are reported in main.ipynb.
- The model overfits slightly, but the validation and test scores are acceptable.
Data augmentation and further hyperparameter tuning are recommended to improve the model.
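For reference, RMSE and a simple overfitting check can be computed as in the sketch below. The prices are made-up numbers, not results from this project, and the `overfit_ratio` helper is a hypothetical convenience, not something defined in the notebooks.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def overfit_ratio(train_rmse, holdout_rmse):
    """Ratio above 1 means the model does worse on unseen data than on training data."""
    return holdout_rmse / train_rmse


# Made-up house prices for illustration only.
train_true = [200_000, 150_000, 320_000]
train_pred = [205_000, 148_000, 315_000]
test_true = [180_000, 260_000, 410_000]
test_pred = [186_000, 256_000, 405_000]

train_rmse = rmse(train_true, train_pred)
test_rmse = rmse(test_true, test_pred)
ratio = overfit_ratio(train_rmse, test_rmse)

print(f"train RMSE: {train_rmse:.0f}")
print(f"test RMSE:  {test_rmse:.0f}")
print(f"test/train ratio: {ratio:.2f}")
```

A ratio only slightly above 1 corresponds to the mild overfitting described above; a much larger ratio would suggest stronger regularization or more data is needed.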

How to Use

Navigate to the project directory.
Install the necessary dependencies by running `pip install -r requirements.txt`.
Open main.ipynb in the notebooks directory and run it. This runs the whole project and can take some time.
