SpotHitPy: A Study For ML-Based Song Hit Prediction Using Spotify
This study addresses the hit song prediction problem for the music industry, but its contribution is incremental: it applies existing methods to a new dataset.
The study tackled predicting Billboard hit songs using Spotify audio features, achieving approximately 86% accuracy with Random Forest and Support Vector Machine as the most successful models.
In this study, we address the Hit Song Prediction problem, which aims to predict which songs will become Billboard hits. We gathered a dataset of nearly 18,500 hit and non-hit songs and extracted their audio features using the Spotify Web API. We tested four machine-learning models on this dataset and were able to predict the Billboard success of a song with approximately 86% accuracy. The most successful algorithms were Random Forest and Support Vector Machine.
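The evaluation described above can be illustrated with a minimal sketch, which is not the study's actual code. It assumes scikit-learn is available and uses synthetic data from `make_classification` in place of the real Spotify audio features and Billboard labels. The feature count, sample size, hyperparameters, and 5-fold cross-validation are all assumptions made for illustration. Only the two models named in the abstract are included, since the other two are not specified here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset (assumption): each row is a song,
# each column a numeric audio feature, and the label is 1 (hit) or 0 (non-hit).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)

# The two models named in the abstract; hyperparameters are illustrative only.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # SVMs are sensitive to feature scale, so standardize inside the pipeline.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}


def evaluate(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each model."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = evaluate(models, X, y)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Because the data here is synthetic, the printed accuracies say nothing about the roughly 86% reported in the study; the sketch only shows the shape of a hit versus non-hit comparison across classifiers.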