The dataset is provided by HELP International. The objective is to categorize countries by overall development using socio-economic and health factors.
pandas:
Data analysis and manipulation tool.
matplotlib:
Visualization library.
seaborn:
Data visualization library based on matplotlib that enhances the style of matplotlib plots.
NumPy:
Numerical computing library.
scikit-learn:
Machine Learning library.
Bokeh:
Library for interactive data visualization.
Plotly Express:
High-level Python visualization library.
After a brief exploratory data analysis, several unsupervised algorithms (K-means, Affinity Propagation, and a Gaussian Mixture Model) are used to group countries into three categories.
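As a rough illustration of the clustering step, the sketch below fits the three algorithms with scikit-learn. It is not the project's actual code: the data is a synthetic stand-in (three well-separated groups with five made-up features), and the standardization step and all parameter values are assumptions.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, KMeans
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the country indicators (assumption, not the real dataset):
# 3 well-separated groups of 30 rows, 5 numeric features each.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 5, [5.0] * 5, [10.0] * 5])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 5)) for c in centers])

# Standardize features so that no single indicator dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

# K-means and the Gaussian mixture take the number of groups explicitly.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
gmm_labels = GaussianMixture(n_components=3, random_state=0).fit_predict(X_scaled)

# Affinity Propagation chooses the number of clusters itself; it is
# influenced by the `preference` parameter, left at its default here.
ap_labels = AffinityPropagation(random_state=0).fit_predict(X_scaled)

print("K-means clusters:", len(set(kmeans_labels)))
print("GMM components used:", len(set(gmm_labels)))
print("Affinity Propagation clusters:", len(set(ap_labels)))
```

Because Affinity Propagation does not accept a target cluster count, getting exactly three groups from it may require tuning `preference` or merging clusters afterwards.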
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a non-linear dimensionality reduction technique that is particularly well suited to visualizing high-dimensional datasets. The data is reduced to two dimensions with t-SNE and plotted with Bokeh.
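A minimal sketch of this reduction and plot might look like the following. The data is again synthetic, and the `perplexity` value, the placeholder names, and the hover tooltip are assumptions rather than the project's actual settings.

```python
import numpy as np
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled country indicators (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 9))
X_scaled = StandardScaler().fit_transform(X)
names = [f"country_{i}" for i in range(len(X))]  # hypothetical placeholder names

# Reduce to two dimensions; perplexity must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X_scaled)

# Build an interactive scatter plot with a hover tooltip showing the name.
source = ColumnDataSource(
    data={"x": embedding[:, 0], "y": embedding[:, 1], "country": names}
)
plot = figure(title="t-SNE projection", tooltips=[("country", "@country")])
plot.scatter("x", "y", source=source, size=8)
```

Calling `bokeh.plotting.show(plot)` would then open the figure in a browser. t-SNE output depends on the perplexity and the random seed, so distances between far-apart groups in the plot should be read with caution.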
Interactive map visualizations show the results of the preceding analysis.