leanding_loan_club_defaulters_prediction

1. Objective: Determine whether a borrower is a defaulter or not => classification problem.

1.1 Data overview:

Number of data points: 42,538
Attributes: 115

1.2 Type of Machine learning problem

The task is to predict whether a borrower will default => binary classification problem

1.3 Performance Metrics used

Accuracy
Confusion Matrix
Precision, Recall & F1 Score
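To make these metrics concrete, here is a minimal sketch that derives them from confusion-matrix counts. The labels in `y_true` and `y_pred` are made-up illustrative values (1 = defaulter), not results from this project, and `binary_metrics` is a hypothetical helper written for this example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return confusion-matrix counts and derived metrics for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative, imbalanced labels: 8 non-defaulters, 2 defaulters.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1, 1, 0]

model_scores = binary_metrics(y_true, y_pred)
# A "model" that always predicts the majority class, for comparison.
baseline_scores = binary_metrics(y_true, [0] * 10)

print(model_scores)
print(baseline_scores)
```

On these toy labels both prediction sets reach the same accuracy, but the majority-class baseline has zero recall. This is why accuracy alone can mislead on imbalanced data.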

2. Exploratory Data Analysis

2.1 Distribution of target variable

- Observation: Imbalanced Data
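A class imbalance like this can be checked with a simple frequency count. The sketch below uses a toy DataFrame; the column name `loan_status` and its labels are assumptions made for illustration, and the counts are not the project's actual numbers.

```python
import pandas as pd

# Toy stand-in for the loan data (assumed column name and labels).
df = pd.DataFrame(
    {"loan_status": ["Fully Paid"] * 17 + ["Charged Off"] * 3}
)

# Absolute counts and relative shares of each class.
counts = df["loan_status"].value_counts()
shares = df["loan_status"].value_counts(normalize=True)

print(counts)
print(shares.round(2))
```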

2.2 Bivariate analysis of Interest Rate vs. Target variable (Defaulters)

- Observation: Even though the data is imbalanced, borrowers with higher interest rates tend to fall into the defaulter category.

2.3 Bivariate analysis of Interest Rate vs. Loan Term (months)

- Observation: Borrowers with higher interest rates tend to have a 60-month term rather than a 36-month term.

2.4 Bivariate analysis of Interest Rate vs. Grade

- Interest rates between 5% and 9% fall under grade A.
- Interest rates between 10% and 12% fall under grade B.
- Interest rates between 13% and 16% fall under grades C and D.
- Interest rates greater than 17% carry the risk of falling into the defaulter category, i.e. grades F and G.
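Bivariate observations of this kind can be reproduced with grouped summaries. The following is a minimal sketch on invented rows; the column names `int_rate`, `grade` and `term` are assumptions, and the values are chosen only to mirror the pattern described above.

```python
import pandas as pd

# Invented example rows; not taken from the project data.
df = pd.DataFrame({
    "int_rate": [6.0, 7.5, 11.0, 12.0, 14.5, 15.0, 18.0, 20.5],
    "grade":    ["A", "A", "B", "B", "C", "D", "F", "G"],
    "term":     [36, 36, 36, 36, 60, 36, 60, 60],
})

# Interest-rate range observed within each grade.
by_grade = df.groupby("grade")["int_rate"].agg(["min", "max"])

# Median interest rate for each loan term (in months).
by_term = df.groupby("term")["int_rate"].median()

print(by_grade)
print(by_term)
```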

3. Feature Engineering

Unnecessary variables, such as zip code, id, or columns containing only a single category, have been dropped.
Features such as Grade, Interest rate (in %), and sub_grade, which carry more information, are handled with ordinal encoding.
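The steps above can be sketched as follows. The column names (`id`, `zip_code`, `policy_code`, `grade`, `int_rate`) and values are assumptions for illustration. The A-to-G mapping is one plausible ordinal encoding, and the interest rate is simply parsed to a number, which is a simplification and may differ from what the notebooks do.

```python
import pandas as pd

# Toy frame with assumed column names; values are invented.
df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "zip_code": ["940xx", "100xx", "606xx", "770xx"],
    "policy_code": [1, 1, 1, 1],  # only a single category
    "grade": ["A", "C", "B", "G"],
    "int_rate": ["7.5%", "13.9%", "10.2%", "21.0%"],
})

# 1. Drop identifier-like columns.
df = df.drop(columns=["id", "zip_code"])

# 2. Drop columns that hold a single distinct value.
constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
df = df.drop(columns=constant_cols)

# 3. Ordinal-encode grade so that A < B < ... < G becomes 0 < 1 < ... < 6.
grade_order = {g: i for i, g in enumerate("ABCDEFG")}
df["grade"] = df["grade"].map(grade_order)

# 4. Turn the percentage string into a numeric interest rate.
df["int_rate"] = df["int_rate"].str.rstrip("%").astype(float)

print(df)
```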

4. Feature Selection

4.1 f_regression is used to score feature importance; features with a p-value above the 0.3 threshold are dropped.
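A minimal sketch of this p-value filter, using scikit-learn's `f_regression` on synthetic data. The feature names `signal` and `noise`, the data itself and the variable names are assumptions for illustration; only the 0.3 threshold comes from the description above.

```python
import numpy as np
from sklearn.feature_selection import f_regression

rng = np.random.default_rng(0)
n = 500

# One feature that drives the target and one that is pure noise.
signal = rng.normal(size=n)
noise = rng.normal(size=n)
y = (signal + 0.5 * rng.normal(size=n) > 1.0).astype(int)

X = np.column_stack([signal, noise])
names = ["signal", "noise"]

# Univariate F-test of each feature against the target.
f_stat, p_values = f_regression(X, y)

# Keep features whose p-value does not exceed the threshold.
P_VALUE_THRESHOLD = 0.3
kept = [name for name, p in zip(names, p_values) if p <= P_VALUE_THRESHOLD]

for name, f, p in zip(names, f_stat, p_values):
    print(f"{name}: F={f:.2f}, p={p:.4f}")
print("kept:", kept)
```

A threshold of 0.3 is permissive, so an uninformative column can still pass by chance on a given sample. scikit-learn also provides `f_classif`, the analogous F-test intended for categorical targets.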

5. Comparison of accuracies of the different machine learning models used

1. Gaussian Naive Bayes

2. Logistic Regression

3. Decision Trees

4. Random Forest Classifier
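A comparison of these four model families can be sketched with scikit-learn as shown below. The data is synthetic (`make_classification` with an assumed 85/15 class split). The hyperparameters are defaults or arbitrary choices, so the scores say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the loan data.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=0,
)

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record accuracy and recall on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "recall": recall_score(y_test, pred),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, "
          f"recall={scores['recall']:.3f}")
```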

Conclusion & Business value

After loading the data, we started with EDA to understand it through different univariate, bivariate and multivariate tools, and also handled outliers and NaN values. In feature engineering we created some new features and removed unwanted features that added little value. Features such as Grade, Sub_Grade, interest rate and term played major roles in determining whether a person is a defaulter or not. Finally, we compared all the models with respect to their accuracy, confusion matrix, precision and recall, and all of the models performed well in this case.

