This repository contains my supervised and unsupervised machine learning project for APAN5205 Applied Analytics Frameworks and Methods 2 at Columbia University.
I used unsupervised techniques such as clustering, text mining, and sentiment analysis, and supervised techniques such as logistic regression and decision trees, to better predict employee attrition at IBM. The main finding was that combining sentiment analysis of scraped Glassdoor reviews with clustering of the IBM dataset before prediction improved the model's prediction accuracy.
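
As a rough illustration of the cluster-then-predict idea (not the project's actual code, data, or results), the sketch below clusters a tiny synthetic dataset and then fits one small model per cluster. Everything in it is an assumption made for the example: Python as the language, k-means as the clustering method, a depth-1 decision tree (a "stump") as a stand-in for the logistic regression and decision tree models, and two made-up numeric features. The data is deliberately constructed so that the relationship between a feature and the outcome flips between segments, which is the situation where clustering first can help.

```python
import math

def kmeans(points, k, iters=10):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def fit_stump(X, y):
    """Depth-1 decision tree: pick (feature, threshold, label_if_above) with the best training accuracy."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for above in (0, 1):
                preds = [above if x[j] > t else 1 - above for x in X]
                acc = sum(p == target for p, target in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, above)
    return best[1:]

def predict_stump(stump, x):
    j, t, above = stump
    return above if x[j] > t else 1 - above

def accuracy(preds, y):
    return sum(p == target for p, target in zip(preds, y)) / len(y)

# Synthetic data (hypothetical, NOT the IBM dataset): two well-separated
# segments along the first feature; the second feature predicts the outcome
# in opposite directions in each segment.
X = [(x1, x2) for x1 in (0.0, 1.0, 9.0, 10.0) for x2 in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)]
y = [int(x2 > 0.5) if x1 < 5 else int(x2 < 0.5) for x1, x2 in X]

# Baseline: one model fitted on all rows.
global_stump = fit_stump(X, y)
global_acc = accuracy([predict_stump(global_stump, x) for x in X], y)

# Cluster first, then fit one model per cluster.
labels = kmeans(X, k=2)
stumps = {}
for c in set(labels):
    Xc = [x for x, lab in zip(X, labels) if lab == c]
    yc = [target for target, lab in zip(y, labels) if lab == c]
    stumps[c] = fit_stump(Xc, yc)
segmented_acc = accuracy([predict_stump(stumps[lab], x) for x, lab in zip(X, labels)], y)

# Training accuracy only; a real comparison would use held-out data.
print(f"single model: {global_acc:.2f}, cluster-then-predict: {segmented_acc:.2f}")
```

On this contrived data a single stump cannot do better than chance, while the per-cluster stumps fit each segment exactly. Real attrition data will not separate this cleanly, so the size of any gain has to be measured on a held-out test set.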