This project covers data preparation, cleaning, exploratory data analysis (EDA), geospatial analysis (JSON shapefile), regression analysis, cluster analysis (k-means), and time-series analysis in Python. The visualisations are created with Seaborn, Matplotlib and Plotly.
The World Happiness Report is an annual publication that explores the factors contributing to human well-being, the happiness ratings of countries and the importance of measuring happiness. Experts use responses from people in more than 140 nations to rank the world’s ‘happiest’ countries. Respondents are asked to rate their lives on a scale from 0 (worst) to 10 (best) using the Cantril Ladder.
In this analysis, we explore how different factors contributed to happiness across countries over the five years from 2019 to 2023. In particular, we are interested in the role economic prosperity plays in determining happiness.
The datasets are part of a collection called World Happiness Reports 2013 – 2023 on Kaggle. We will analyse the reports from 2019 to 2023. The data is available under the Community Data License Agreement - Permissive - Version 1.0.
The JSON shapefile needed for the geospatial analysis is in the project folder, under "02 Data/Original Data".
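As a minimal sketch of how a notebook might locate and read that file, the snippet below builds the path from the folder named above and parses the JSON. The project root, the file name passed in, and the GeoJSON-style top-level `features` list are assumptions for illustration; the project does not confirm any of them.

```python
import json
from pathlib import Path

# Folder named in this README, relative to an assumed project root.
ORIGINAL_DATA = Path("02 Data") / "Original Data"


def load_shapes(project_root, filename):
    """Read a JSON shape file from '02 Data/Original Data' under project_root."""
    path = Path(project_root) / ORIGINAL_DATA / filename
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def feature_count(shapes):
    """Count features, assuming a GeoJSON-style top-level 'features' list."""
    return len(shapes.get("features", []))
```

A call could look like `load_shapes(".", "countries.json")`, where `countries.json` is a hypothetical file name standing in for the real one.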
For the time-series analysis, we decided to use a different dataset with more data points. You can find it at Nasdaq Data.
- How does the distribution of happiness scores vary across different regions and years?
- Is there a correlation between GDP per capita and happiness scores?
- Which other factors have the strongest impact on happiness scores?
- Which countries consistently maintain high or low happiness rankings over the years?
- What recommendations can be made to policymakers to improve overall well-being in different parts of the world?
The project files are organized into the following folders:
- 01 Project Management: includes the Project Brief and the Data License Agreement.
- 02 Data: divided into two subfolders:
  - Original Data: contains the original data frames.
  - Prepared Data: holds cleaned and wrangled data frames, ready for analysis.
- 03 Scripts: contains Jupyter notebooks with the corresponding code.
- 04 Analysis: holds the Data Sourcing Report.
- 05 Sent to Client: contains the dashboard plan.
Here's the link to the storyboard created in Tableau. This storyboard doesn’t contain every step we took as part of the analysis, only those relevant to the final results.