Predicting song release year on the Million Song Dataset with machine learning (MLlib package) using PySpark
The "Year Prediction MSD Dataset" from the UCI Machine Learning Repository is used for this project: https://archive.ics.uci.edu/ml/datasets/yearpredictionmsd

- Load the dataset and use min-max scaling to scale each feature to between 0 and 1
- Normalize the labels by subtracting the minimum year, so the earliest year maps to 0
- Split the dataset into train (70%), test (20%), and validation (10%) sets
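
The preprocessing steps listed above can be sketched in plain Python to make the arithmetic concrete. This is a minimal illustration on made-up toy data, not the project's actual PySpark code: the function names (`min_max_scale`, `shift_labels`, `split_dataset`), the seed, and the sample years are all assumptions introduced here. In PySpark the split would more likely be done with `RDD.randomSplit`, which produces approximate rather than exact proportions.

```python
import random


def min_max_scale(rows):
    """Scale every feature column to [0, 1] using that column's min and max."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return [
        [
            # A constant column has no range, so map it to 0.0 (assumption).
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ]
        for row in rows
    ]


def shift_labels(years):
    """Subtract the minimum year so that labels start at 0."""
    min_year = min(years)
    return [y - min_year for y in years], min_year


def split_dataset(examples, fractions=(0.7, 0.2), seed=42):
    """Shuffle, then cut into train/test/validation (validation gets the rest)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_test = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


# Hypothetical toy data: 10 songs with 2 features each.
years = [2001, 1985, 1999, 2010, 1992, 2005, 1978, 2003, 1996, 2008]
features = [[float(i), float(i * i)] for i in range(10)]

scaled = min_max_scale(features)
labels, min_year = shift_labels(years)
train, test, validation = split_dataset(list(zip(labels, scaled)))
```

To turn a prediction back into a calendar year, add `min_year` to the model's output. Computing the scaling minima and maxima on the full dataset, as the list above describes, is the simplest option; fitting them on the training split only is a common alternative when leakage is a concern.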