Song Hit Prediction: Predicting Billboard Hits Using Spotify Data
This work addresses the problem of predicting commercial song success for the music industry, but its contribution is incremental: it applies existing methods to a new dataset.
The paper tackles the Hit Song Science problem by predicting Billboard hits from Spotify audio features, reporting 88% accuracy with a random forest model on a dataset of approximately 1.8 million songs.
In this work, we attempt to solve the Hit Song Science problem, which aims to predict which songs will become chart-topping hits. We constructed a dataset of approximately 1.8 million hit and non-hit songs and extracted their audio features using the Spotify Web API. We tested four models on this dataset. Our best model, a random forest, predicted Billboard song success with 88% accuracy.
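The general shape of the approach described above (audio features in, hit or non-hit label out, accuracy measured on held-out songs) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. It uses bagged one-split decision trees ("stumps") with random feature subsets, a heavily simplified relative of a random forest, which normally grows much deeper trees. The data is synthetic, the labeling rule is invented for illustration, and the choice of three features and every function name below are assumptions.

```python
import random
from collections import Counter

# Real Spotify audio-feature names; using only these three is an assumption.
FEATURES = ["danceability", "energy", "valence"]


def make_toy_songs(n, rng):
    """Generate hypothetical (features, is_hit) pairs; no real Spotify or Billboard data."""
    songs = []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        # Invented labeling rule for illustration only, with 10% label noise.
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.1:
            y = 1 - y
        songs.append((x, y))
    return songs


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(sample, feature_ids):
    """Pick the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in (i / 10 for i in range(1, 10)):
            left = [y for x, y in sample if x[f] <= t]
            right = [y for x, y in sample if x[f] > t]
            if not left or not right:
                continue
            lo, hi = majority(left), majority(right)
            errors = sum(y != lo for y in left) + sum(y != hi for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, lo, hi)
    return best[1:]  # (feature index, threshold, label below, label above)


def fit_ensemble(train, n_trees, rng):
    """Train each stump on a bootstrap sample using a random subset of two features."""
    stumps = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(train) for _ in train]  # sample with replacement
        feature_ids = rng.sample(range(len(FEATURES)), 2)
        stumps.append(fit_stump(bootstrap, feature_ids))
    return stumps


def predict(stumps, x):
    """Classify one song by majority vote over all stumps."""
    votes = [hi if x[f] > t else lo for f, t, lo, hi in stumps]
    return majority(votes)


def accuracy(stumps, songs):
    """Fraction of songs whose predicted label matches the true label."""
    return sum(predict(stumps, x) == y for x, y in songs) / len(songs)


rng = random.Random(0)
songs = make_toy_songs(2000, rng)
train, test = songs[:1500], songs[1500:]  # simple held-out split
stumps = fit_ensemble(train, n_trees=25, rng=rng)
test_accuracy = accuracy(stumps, test)
print(f"held-out accuracy on toy data: {test_accuracy:.2f}")
```

The accuracy printed here reflects only the invented toy rule and says nothing about the 88% figure reported in the abstract. The sketch shows only how such a figure is obtained: the model is fit on one portion of the songs and scored on songs it has not seen.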