From Joy to Fear: A Benchmark of Emotion Estimation in Pop Song Lyrics
This work addresses emotion estimation in song lyrics for music information retrieval applications. Its contribution is incremental, since it applies existing methods to a new dataset.
This paper tackles the problem of estimating emotional content in pop song lyrics by predicting six emotional intensity scores. It constructs a manually labeled dataset using mean opinion scores, evaluates large language models (LLMs) in zero-shot scenarios alongside a fine-tuned BERT model, and finds that LLMs show potential for emotion recognition in creative texts.
The emotional content of song lyrics plays a pivotal role in shaping listener experiences and influencing musical preferences. This paper investigates the task of multi-label emotional attribution of song lyrics by predicting six intensity scores, one for each of six fundamental emotions. A manually labeled dataset is constructed using a mean opinion score (MOS) approach, which aggregates annotations from multiple human raters to obtain reliable ground-truth labels. Leveraging this dataset, we conduct a comprehensive evaluation of several publicly available large language models (LLMs) under zero-shot scenarios. Additionally, we fine-tune a BERT-based model specifically for predicting multi-label emotion scores. Experimental results reveal the relative strengths and limitations of zero-shot and fine-tuned models in capturing the nuanced emotional content of lyrics. Our findings highlight the potential of LLMs for emotion recognition in creative texts, providing insights into model selection strategies for emotion-based music information retrieval applications. The labeled dataset is available at https://github.com/LLM-HITCS25S/LyricsEmotionAttribution.
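The MOS aggregation described in the abstract can be illustrated with a minimal sketch. The emotion names, the 0 to 5 rating scale, the example ratings, and the function name `mean_opinion_scores` are all assumptions made for illustration. The abstract does not list the six emotions or the scale, and the authors' actual aggregation procedure may differ.

```python
from statistics import mean

# Hypothetical label set: the abstract says "six fundamental emotions"
# without naming them, so these names are an assumption.
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]


def mean_opinion_scores(ratings):
    """Average per-emotion intensity ratings across raters.

    ratings: list of dicts, one per rater, mapping emotion -> intensity.
    Returns a dict mapping each emotion to its mean opinion score.
    """
    if not ratings:
        raise ValueError("at least one rater is required")
    return {emotion: mean(r[emotion] for r in ratings) for emotion in EMOTIONS}


# Made-up ratings from three raters for a single lyric (assumed 0-5 scale).
example_ratings = [
    {"joy": 4, "sadness": 1, "anger": 0, "fear": 0, "surprise": 2, "disgust": 0},
    {"joy": 5, "sadness": 0, "anger": 0, "fear": 1, "surprise": 2, "disgust": 0},
    {"joy": 3, "sadness": 2, "anger": 0, "fear": 2, "surprise": 2, "disgust": 0},
]

mos = mean_opinion_scores(example_ratings)
```

Each lyric then carries a six-value target vector, which is the shape a multi-label regression model or a zero-shot LLM prompt would be asked to predict.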