A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability
This work addresses the problem of understanding what makes music memorable, with applications such as recommendation systems. The contribution is incremental: it builds on existing concepts, adding new data and methods.
The authors measure and predict music memorability by collecting a new dataset with reliable labels and training baselines on interpretable features and audio mel-spectrograms. They show that prediction is possible with limited data and identify higher valence and arousal, and faster tempo, as contributing factors.
Nowadays, humans are constantly exposed to music, whether voluntarily through streaming services or incidentally during commercial breaks. Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. To achieve this, we collect a new dataset of music pieces with reliable memorability labels using a novel interactive experimental procedure. We then train baselines to predict and analyze music memorability, using both interpretable features and audio mel-spectrograms as inputs. To the best of our knowledge, we are the first to explore music memorability using data-driven, deep learning-based methods. Through a series of experiments and ablation studies, we demonstrate that, while there is room for improvement, predicting music memorability with limited data is possible. Certain intrinsic elements, such as higher valence and arousal, and faster tempo, contribute to memorable music. As prediction techniques continue to evolve, real-life applications like music recommendation systems and music style transfer will undoubtedly benefit from this new area of research.
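For readers unfamiliar with the mel-spectrogram input mentioned above, the following minimal sketch shows how frequency bands are spaced on the mel scale. It assumes the widely used HTK-style formula, mel = 2595 log10(1 + f/700); the abstract does not state which mel convention, number of bands, or frequency range the baselines use, so the function names and the values below (8 bands, 0 to 8000 Hz) are purely illustrative.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Illustrative values only; not taken from the paper.
    for center in mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0):
        print(f"{center:8.1f} Hz")
```

Because the bands are equally spaced in mels rather than in Hz, they are narrow at low frequencies and wider at high frequencies, which roughly mirrors human pitch perception. A mel-spectrogram sums short-time spectral energy within such bands.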