Here is my notebook on predicting which employees are at risk of termination, using this dataset:
https://www.kaggle.com/manasdalakoti/univai-hack-data
Hello everyone! This is my take on determining whether or not an employee is at risk of termination, which is a binary classification problem.
The tools used are:
Pandas for data manipulation and ingestion
NumPy for multidimensional array computing
Matplotlib and seaborn for data visualization
WordCloud for finding the most frequent strings
Imblearn for oversampling the minority class
Scikit-learn for data preprocessing
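Since Imblearn is listed for oversampling, here is a minimal sketch of the idea behind random oversampling: minority-class rows are duplicated at random until every class is as large as the biggest one. It uses only the Python standard library and made-up toy data, so it is an illustration rather than the notebook's actual code; the names random_oversample, rows and labels are hypothetical. In imblearn itself, the RandomOverSampler class offers this kind of resampling through fit_resample, though which oversampler the notebook uses is not stated here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes match the largest one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Add random copies until this class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up for illustration): six "not at risk" (0) vs two "at risk" (1).
rows = [[25], [31], [40], [38], [29], [45], [52], [23]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

The original rows are kept as they are and only copies of minority rows are appended, so nothing is thrown away. Oversampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.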
The models trained and the accuracy each one reached:
Random Forest Classifier:
Accuracy Reached: 95.74%
XGBoost Classifier:
Accuracy Reached: 93.17%
LightGBM (Light Gradient Boosting) Classifier:
Accuracy Reached: 91.10%
CatBoost Classifier:
Accuracy Reached: 95.74%
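For readers new to the metric, here is a small sketch of how an accuracy figure like the ones above is computed: the share of predictions that match the true labels. The labels below are made up for illustration and have nothing to do with the notebook's results; the accuracy helper is a hypothetical name, not a library function.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels (made up): 1 = at risk, 0 = not at risk.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 1, 0, 0, 0, 0, 1]

print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")  # prints "Accuracy: 87.50%"
```

On imbalanced data a high accuracy can come mostly from the majority class, so it may be worth checking per-class results alongside it.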
-> Feel free to leave any suggestions for improving the notebook in the comments.
-> Thank you for your time. Cheers! 🌟