The company provides a mobile app that has both free and paid versions of the product. The aim is to predict which users will subscribe to the paid membership.
- Data import
  - Load libraries
  - Load dataset
- Exploratory data analysis (EDA)
  - Data cleaning
    - Convert column type
    - Remove unnecessary columns
  - Data visualizing
    - Plot histogram
    - Correlation with response
    - Correlation matrix
      - Compute the correlation matrix
      - Generate a mask for the upper triangle
      - Draw the heatmap with the mask and correct aspect ratio
- Feature engineering
  - Formatting date columns
  - Selecting time for response
  - Load top screens
  - Mapping screens to fields
  - Funnels
- Data preprocessing
  - Splitting independent and response variables
  - Splitting the dataset into the training set and test set
  - Removing identifiers
  - Feature scaling
- Model building
  - Fitting model to the training set
  - Predicting test set
  - Evaluating results
    - Print confusion matrix
    - Print heatmap
    - Print classification report
  - Formatting final results
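
The correlation matrix steps in the outline (compute the matrix, generate a mask for the upper triangle, draw the heatmap) might look roughly like the sketch below. It is an illustration on synthetic data, not the project's code: the column names `age`, `numscreens`, and `used_premium_feature` and the helper functions `masked_correlation` and `draw_heatmap` are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def masked_correlation(features: pd.DataFrame):
    """Return the correlation matrix and a boolean mask hiding its upper triangle."""
    corr = features.corr()
    # True marks cells to hide: the diagonal and everything above it.
    mask = np.triu(np.ones(corr.shape, dtype=bool))
    return corr, mask


def draw_heatmap(corr: pd.DataFrame, mask: np.ndarray):
    """Draw the masked heatmap; plotting libraries are imported lazily."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 6))
    # square=True keeps each cell at a 1:1 aspect ratio.
    sns.heatmap(corr, mask=mask, cmap="coolwarm", center=0,
                square=True, linewidths=0.5, cbar_kws={"shrink": 0.5}, ax=ax)
    return ax


# Synthetic stand-in for the app data; the column names are hypothetical.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(18, 60, size=200),
    "numscreens": rng.integers(1, 50, size=200),
    "used_premium_feature": rng.integers(0, 2, size=200),
})
corr, mask = masked_correlation(demo)
# draw_heatmap(corr, mask)  # uncomment to render the figure
```

Masking the upper triangle removes the duplicated half of the symmetric matrix, which makes the remaining cells easier to read.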
The final model reached almost 100% accuracy on the test set.
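
The preprocessing, model building, and evaluation steps from the outline can be sketched as follows. This is a minimal sketch on synthetic data and makes several assumptions: the outline does not name the classifier, so `LogisticRegression` is a stand-in, and the column names `user`, `enrolled`, `age`, and `numscreens` are hypothetical. The numbers it prints say nothing about the project's actual results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; "user", "enrolled" and the feature names are hypothetical.
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "user": np.arange(n),
    "age": rng.integers(18, 60, size=n),
    "numscreens": rng.integers(1, 50, size=n),
})
# Make the label depend on a feature so the model has something to learn.
data["enrolled"] = (data["numscreens"] + rng.normal(0, 10, size=n) > 25).astype(int)

# Split independent and response variables.
response = data["enrolled"]
features = data.drop(columns="enrolled")

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    features, response, test_size=0.2, random_state=0
)

# Remove the identifier from the features, but keep it to label predictions later.
test_ids = X_test["user"]
X_train = X_train.drop(columns="user")
X_test = X_test.drop(columns="user")

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Model choice is an assumption; the outline does not name the classifier.
model = LogisticRegression(random_state=0)
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

# Evaluate on the held-out test set.
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

# Final results: one row per test user with the true and predicted label.
final_results = pd.DataFrame({
    "user": test_ids.to_numpy(),
    "enrolled": y_test.to_numpy(),
    "predicted": y_pred,
})
print(cm)
print(report)
```

Fitting the scaler on the training set alone keeps test-set statistics out of the model, so the reported accuracy reflects data the model has not seen.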
This implementation was inspired by the Machine Learning Practical Udemy course by Kirill Eremenko, Hadelin de Ponteves, Dr. Ryan Ahmed, Ph.D., MBA, the SuperDataScience Team, and Rony Sulca.