This notebook is easier to read on Kaggle, where the plotly charts are fully interactive.
Loans are an essential part of our economy. People borrow money from financial institutions all the time, whether to start a business, cover emergency expenses, finance a vehicle, or pay for a vacation or education.
However, when lending money to someone, there is always the risk that the person may not be able to pay it back. For financial institutions such as banks, which lend large amounts of money to many different people for many different reasons, the potential losses from defaults grow accordingly.
For this reason, it is extremely important that financial institutions avoid lending to people who are highly likely to default, and they usually invest a lot of time and resources in background checks to limit their losses. In this notebook, I'll develop a machine learning model that predicts how likely a client is to default based on whether or not they are employed, their bank balance, and their annual salary.
To develop this loan default predictor, I've used the Loan Default Prediction dataset on Kaggle, a synthetic dataset created from actual data from a financial institution and covering 10,000 clients. It's important to note that this data has been transformed to prevent identification of the clients and the institution.
The attributes in this dataset are as follows:
- Employed: 1 for employed and 0 for unemployed;
- Bank Balance: The amount of money the client had available in their account at the moment the data was obtained;
- Annual Salary: The annual salary of each client;
- Defaulted?: This is our target variable. It is 0 for each client who didn't default and 1 for each client who defaulted on their loan.
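To make the schema above concrete, here is a minimal sketch of a first sanity check on a table with these four columns. The rows are toy values invented for illustration, not taken from the Kaggle dataset, and the variable names are my own; in the notebook the DataFrame would be loaded from the dataset file instead.

```python
import pandas as pd

# Toy rows invented for illustration only; the real dataset has 10,000 clients
# and would normally be loaded from the Kaggle CSV with pd.read_csv.
df = pd.DataFrame(
    {
        "Employed": [1, 0, 1, 1, 0, 1],
        "Bank Balance": [8754.36, 9806.16, 12882.60, 6351.00, 21871.07, 0.0],
        "Annual Salary": [532339.56, 145273.56, 381205.68, 428453.88, 220000.0, 300000.0],
        "Defaulted?": [0, 0, 0, 0, 1, 0],
    }
)

# Both flag columns should only ever contain 0 or 1.
binary_columns = ["Employed", "Defaulted?"]
is_binary = df[binary_columns].isin([0, 1]).all().all()

# Share of clients in each class of the target variable.
class_share = df["Defaulted?"].value_counts(normalize=True)

# Default rate split by employment status.
default_rate_by_employment = df.groupby("Employed")["Defaulted?"].mean()

print(is_binary)
print(class_share)
print(default_rate_by_employment)
```

Checking the class shares early is useful because a heavily skewed target would affect how the model should later be evaluated.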
I've used some EDA techniques to evaluate how the attributes interact with each other and how relevant they are to the target variable.
The notebook relies on the following Python libraries:
- pandas
- numpy
- plotly
- matplotlib
- seaborn
- sklearn
- pycaret
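As a rough illustration of how some of these libraries fit the task described above, the sketch below runs a correlation check against the target and then fits a simple classifier that outputs a default probability per client. This is not the notebook's actual pipeline: the data is synthetic and generated on the spot, the rule linking bank balance to default is an arbitrary assumption made only so the toy data has some signal, and scikit-learn's logistic regression is used as a stand-in baseline.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data invented for this sketch (NOT the Kaggle dataset).
rng = np.random.default_rng(0)
n_clients = 1000
balance = rng.uniform(0, 30_000, n_clients)
data = pd.DataFrame(
    {
        "Employed": rng.integers(0, 2, n_clients),
        "Bank Balance": balance,
        "Annual Salary": rng.uniform(50_000, 800_000, n_clients),
    }
)

# Arbitrary assumption for the toy data only: default risk rises with balance.
default_probability = 1 / (1 + np.exp(-(balance - 20_000) / 3_000))
data["Defaulted?"] = (rng.uniform(size=n_clients) < default_probability).astype(int)

# EDA-style check: linear correlation of each attribute with the target.
correlation_with_target = data.corr()["Defaulted?"].drop("Defaulted?")

features = ["Employed", "Bank Balance", "Annual Salary"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features],
    data["Defaulted?"],
    test_size=0.25,
    stratify=data["Defaulted?"],  # keep the class ratio equal in both splits
    random_state=0,
)

# Scale the features, then fit a logistic regression as a simple baseline.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Estimated probability of default for each held-out client.
default_risk = model.predict_proba(X_test)[:, 1]

print(correlation_with_target)
print(default_risk[:5])
```

Scaling is included because bank balance and annual salary sit on very different numeric ranges, which can slow or skew the fit of a linear model.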
Author
Luis Fernando Torres