Principal Component Analysis Project

Context

Suppose you are working at Zillow as a data scientist.

The goal is to predict house prices.

Each house is described by 80 features, which serve as inputs for predicting its price.

Your task is to isolate the important features in the dataset and build a model that predicts house prices.

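Because 80 features are a lot to model directly, PCA can be applied to reduce the number of inputs to the prediction model while retaining most of the variance in the data. Some information is always lost when components are discarded, so the aim is to keep that loss small.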

Assignments

Data Cleanup and Exploratory Data Analysis

  1. Explore Basic Statistics of each feature
  2. Outlier Detection
  3. Missing value imputation
  4. Correlation Analysis
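
The four steps above can be sketched roughly as follows. This is a minimal illustration using NumPy only, run on synthetic data rather than the project's dataset. The helper names (`basic_stats`, `iqr_outlier_mask`, `impute_median`), the IQR rule with `k=1.5`, and median imputation are all assumptions chosen for the example, not choices prescribed by the assignment.

```python
import numpy as np

# Synthetic stand-in for three numeric housing columns (NOT the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(loc=[1500.0, 3.0, 20.0], scale=[400.0, 1.0, 10.0], size=(200, 3))
X[5, 0] = 10_000.0   # inject an obvious outlier
X[10, 1] = np.nan    # inject a missing value

def basic_stats(X):
    """Per-column mean, standard deviation, min and max, ignoring NaNs."""
    return {
        "mean": np.nanmean(X, axis=0),
        "std": np.nanstd(X, axis=0),
        "min": np.nanmin(X, axis=0),
        "max": np.nanmax(X, axis=0),
    }

def iqr_outlier_mask(X, k=1.5):
    """True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR] of its column."""
    q1 = np.nanpercentile(X, 25, axis=0)
    q3 = np.nanpercentile(X, 75, axis=0)
    iqr = q3 - q1
    return (X < q1 - k * iqr) | (X > q3 + k * iqr)

def impute_median(X):
    """Replace each NaN with the median of its column."""
    medians = np.nanmedian(X, axis=0)
    return np.where(np.isnan(X), medians, X)

stats = basic_stats(X)                       # 1. basic statistics
outliers = iqr_outlier_mask(X)               # 2. outlier detection
X_filled = impute_median(X)                  # 3. missing value imputation
corr = np.corrcoef(X_filled, rowvar=False)   # 4. feature-by-feature correlation matrix
```

Whether flagged outliers should be dropped, capped, or kept is a judgment call that depends on the feature, so the sketch only marks them.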

Feature Preparation and Transformation

  1. Drop unnecessary Columns
  2. Scale the dataset so that all variables are on the same scale
  3. Select the final set of variables for PCA
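
One possible shape for these three steps is sketched below with NumPy on synthetic data. The column names, the choice of standardization (zero mean, unit variance) as the scaling method, and the correlation-with-target selection rule with its `min_abs_corr=0.2` cutoff are all illustrative assumptions; the assignment does not fix any of them.

```python
import numpy as np

# Synthetic stand-in data; the column names are hypothetical, not taken from the project.
rng = np.random.default_rng(1)
n = 1000
names = ["Id", "LotArea", "YearBuilt", "Noise"]
lot_area = rng.normal(10_000, 3_000, n)
year_built = rng.normal(1970, 25, n)
X = np.column_stack([np.arange(n, dtype=float), lot_area, year_built, rng.normal(0, 1, n)])
price = 20.0 * lot_area + 1_000.0 * year_built + rng.normal(0, 10_000, n)

def drop_columns(X, names, to_drop):
    """Remove the named columns, e.g. identifiers that carry no signal."""
    keep = [i for i, name in enumerate(names) if name not in to_drop]
    return X[:, keep], [names[i] for i in keep]

def standardize(X):
    """Rescale every column to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - X.mean(axis=0)) / std

def select_by_target_correlation(X, y, names, min_abs_corr=0.2):
    """Keep columns whose absolute Pearson correlation with the target is large enough."""
    corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.abs(corr) >= min_abs_corr
    return X[:, keep], [name for name, k in zip(names, keep) if k]

X_kept, kept_names = drop_columns(X, names, to_drop={"Id"})
X_scaled = standardize(X_kept)
X_selected, selected_names = select_by_target_correlation(X_scaled, price, kept_names)
```

Scaling matters here because PCA is driven by variance: without it, a column measured in large units (such as an area in square feet) would dominate the leading components.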

PCA

  1. Choose a threshold for explained variance
  2. Balance the number of components retained against the variance explained
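
As a rough sketch of how a variance threshold translates into a number of components, the example below computes PCA through a singular value decomposition and keeps the smallest number of components whose cumulative explained variance ratio reaches the threshold. The function name `pca_with_threshold`, the default of 0.95, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pca_with_threshold(X, variance_threshold=0.95):
    """Return (components, explained_variance_ratio, k) for the smallest k whose
    cumulative explained variance ratio reaches variance_threshold."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; squared singular values are
    # proportional to the variance captured along each direction.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    k = min(k, len(S))  # guard against floating-point shortfall near a threshold of 1.0
    return Vt[:k], ratio, k

# Synthetic stand-in: 10 features driven by 2 hidden factors plus small noise.
rng = np.random.default_rng(2)
latent = rng.normal(size=(500, 2))
loadings = np.zeros((2, 10))
loadings[0, :5] = 1.0
loadings[1, 5:] = 1.0
X = latent @ loadings + 0.05 * rng.normal(size=(500, 10))

components, ratio, k = pca_with_threshold(X, variance_threshold=0.95)
X_reduced = (X - X.mean(axis=0)) @ components.T  # project onto the k kept components

# A higher threshold never keeps fewer components than a lower one.
for threshold in (0.50, 0.95, 0.9999):
    _, _, k_t = pca_with_threshold(X, threshold)
    print(f"threshold={threshold}: {k_t} components")
```

The loop at the end illustrates the balance in step 2: a stricter threshold preserves more variance but leaves more inputs for the regression model.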

Linear Regression

  1. Fit a model to the cleaned-up dataset
  2. Comparative study of models with and without PCA
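
The comparison could be set up along the following lines. This is only a sketch on synthetic data: it fits ordinary least squares with NumPy once on all features and once on the leading principal components, then reports R² on a held-out split. The helper names, the 80/20 split, the choice of `k=2`, and R² as the metric are assumptions for the example.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_pca(X_train, k):
    """Learn the mean and the top-k principal directions from the training data only."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:k]

# Synthetic stand-in: 10 features driven by 2 hidden factors; the target depends on both.
rng = np.random.default_rng(3)
latent = rng.normal(size=(600, 2))
loadings = np.zeros((2, 10))
loadings[0, :5] = 1.0
loadings[1, 5:] = 1.0
X = latent @ loadings + 0.1 * rng.normal(size=(600, 10))
y = 3.0 * latent[:, 0] - 2.0 * latent[:, 1] + 0.5 * rng.normal(size=600)

# Rows are already in random order, so a simple 80/20 slice serves as the split.
X_train, X_test = X[:480], X[480:]
y_train, y_test = y[:480], y[480:]

# Without PCA: regress on all 10 features.
r2_full = r2_score(y_test, predict(fit_linear(X_train, y_train), X_test))

# With PCA: project onto k components learned on the training set, then regress.
mean, components = fit_pca(X_train, k=2)
Z_train = (X_train - mean) @ components.T
Z_test = (X_test - mean) @ components.T
r2_pca = r2_score(y_test, predict(fit_linear(Z_train, y_train), Z_test))

print(f"R^2 without PCA: {r2_full:.3f}")
print(f"R^2 with PCA (k=2): {r2_pca:.3f}")
```

Fitting the PCA on the training rows only, and reusing the same mean and components for the test rows, keeps the held-out score from being influenced by the test data.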