SD | AI | LG | Dec 5, 2025

Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction

arXiv:2512.05508v1 | 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses a critical challenge for the music industry, including artists and streaming platforms, by demonstrating the value of lyrics in popularity prediction. The advance is incremental, since it builds on existing multimodal approaches.

This work tackled the problem of predicting music popularity by incorporating lyrics, which were previously under-explored, using an automated pipeline with LLM-based lyric embeddings and a multimodal architecture, resulting in 9% and 20% improvements in MAE and MSE on a dataset of over 100,000 tracks.

Accurately predicting music popularity is a critical challenge in the music industry, offering benefits to artists, producers, and streaming platforms. Prior research has largely focused on audio features, social metadata, or model architectures. This work addresses the under-explored role of lyrics in predicting popularity. We present an automated pipeline that uses an LLM to extract high-dimensional lyric embeddings, capturing semantic, syntactic, and sequential information. These features are integrated into HitMusicLyricNet, a multimodal architecture that combines audio, lyrics, and social metadata to predict a popularity score in the range 0-100. Our method outperforms existing baselines on the SpotGenTrack dataset, which contains over 100,000 tracks, achieving 9% and 20% improvements in MAE and MSE, respectively. Ablation confirms that the gains arise from our LLM-driven lyrics feature pipeline (LyricsAENet), underscoring the value of dense lyric representations.
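To make the reported metrics concrete, here is a minimal sketch of how MAE, MSE, and a percentage improvement over a baseline could be computed for popularity scores in the 0-100 range. It assumes the 9% and 20% figures are relative reductions in error, which the abstract does not state explicitly. The function names and the toy scores are illustrative assumptions, not values or code from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error between true and predicted popularity scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error between true and predicted popularity scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy popularity scores (0-100), invented purely for illustration.
y_true = [50, 20, 80, 10]
baseline_pred = [40, 30, 70, 20]  # hypothetical baseline predictions
model_pred = [45, 25, 75, 15]     # hypothetical improved predictions

mae_gain = relative_improvement(mae(y_true, baseline_pred), mae(y_true, model_pred))
mse_gain = relative_improvement(mse(y_true, baseline_pred), mse(y_true, model_pred))
print(f"MAE improvement: {mae_gain:.1f}%")
print(f"MSE improvement: {mse_gain:.1f}%")
```

Because MSE squares each error, the same set of predictions generally yields different relative improvements in MAE and MSE, as the two printed values show.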

