In this project, supervised and unsupervised machine learning algorithms are used to analyze demographic datasets covering the general population and the customers of a German mail-order company. The goal of the project is twofold:
- Cluster the datasets into groups to identify the characteristics of existing customers and how they differ from the general population
- Develop a predictive model to identify which prospective customers are likely to respond to a marketing campaign
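
The first goal can be illustrated with a minimal sketch. It assumes K-means on standardized features, which may differ from the method used in the project notebooks; the helper name `cluster_share_difference` and the synthetic data are hypothetical stand-ins for the real Arvato datasets. Clusters are fitted on the general population, customers are assigned to those clusters, and the share of each cluster is compared between the two groups.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_share_difference(population, customers, n_clusters=3, seed=0):
    """Return customer share minus population share for each cluster.

    Hypothetical helper: K-means is an assumed choice of algorithm.
    """
    # Scale with statistics from the general population only.
    scaler = StandardScaler().fit(population)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)

    # Fit clusters on the population, then assign customers to them.
    pop_labels = model.fit_predict(scaler.transform(population))
    cust_labels = model.predict(scaler.transform(customers))

    pop_share = np.bincount(pop_labels, minlength=n_clusters) / len(pop_labels)
    cust_share = np.bincount(cust_labels, minlength=n_clusters) / len(cust_labels)
    return cust_share - pop_share


# Synthetic stand-in data: three well-separated population groups,
# with all customers drawn from only one of them.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
population = np.vstack([c + rng.normal(size=(200, 2)) for c in centers])
customers = centers[1] + rng.normal(size=(100, 2))

diff = cluster_share_difference(population, customers)
print(diff)
```

A positive entry marks a cluster that is over-represented among customers relative to the general population, and a negative entry marks an under-represented one.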
This is one of Udacity’s capstone projects for the Data Science Nanodegree program. The data is provided by Arvato Financial Services, a Bertelsmann subsidiary.
The complete project report can be found in this blog post.
The project code was written in Python 3.5 using Jupyter Notebook. All programs and libraries used, including pandas, NumPy, and scikit-learn, are part of the Anaconda distribution.
Kudos to Udacity and Bertelsmann/Arvato for providing this fun and challenging project! Tipping my hat also to Elena Ivanova (lenuel) and DeepVen, who inspired parts of the solutions.