The project is built on a dataset of two CSV files available on the Kaggle platform. The data come from Spotify. The main file, "track.csv", is the largest and most important: it contains information on music tracks spanning a period of 100 years. The other file, "artist.csv", contains one row for each artist. Both files are compressed in data.rar. Following the suggestions of the dataset's author, we identified three main analyses to apply to the data:
- Clustering: group the songs to identify a limited number of genres
- Classification/Regression: to understand which features are most important in estimating the popularity of a song
- Trend Analysis: to see how musical creation has changed over the years
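
As a rough illustration of the trend analysis idea, the sketch below averages one numeric column per release decade. It is only a minimal example, not the project's actual pipeline: the inline sample rows are made up, and the column names `name`, `release_date`, and `popularity` are assumptions about the layout of "track.csv". The helper `mean_by_decade` is hypothetical as well.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample standing in for "track.csv". The column names below are
# assumptions for illustration and may not match the real file.
SAMPLE_CSV = """name,release_date,popularity
Song A,1965-03-01,20
Song B,1968,30
Song C,1994-07-15,55
Song D,1999-01-01,65
"""


def mean_by_decade(rows, value_column):
    """Group rows by release decade and average one numeric column."""
    buckets = defaultdict(list)
    for row in rows:
        # The first four characters hold the year for both "YYYY" and "YYYY-MM-DD".
        year = int(row["release_date"][:4])
        decade = year // 10 * 10
        buckets[decade].append(float(row[value_column]))
    # Return the decades in chronological order.
    return {decade: mean(values) for decade, values in sorted(buckets.items())}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
trend = mean_by_decade(rows, "popularity")
print(trend)
```

To run this on the real data, the in-memory sample would be replaced by the extracted "track.csv" opened with `open(...)`, after checking the actual column names.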