SDCLCYMMASSep 13, 2025

Unpacking Musical Symbolism in Online Communities: Content-Based and Network-Centric Approaches

arXiv:2510.00006v1
Originality Synthesis-oriented
AI Analysis

This work addresses how musical trends reflect cultural shifts in online communities, though it is incremental in combining existing methods.

This paper analyzes musical symbolism in online communities by combining content-based music analysis with network analysis of lyrics, finding a decade-long decline in energy (79 to 58) and rise in danceability (59 to 73) in chart-topping songs, with mood varying systematically by genre such as R&B having the highest mean valence (96).

This paper examines how musical symbolism is produced and circulated in online communities by combining content-based music analysis with a lightweight network perspective on lyrics. Using a curated corpus of 275 chart-topping songs enriched with audio descriptors (energy, danceability, loudness, liveness, valence, acousticness, speechiness, popularity) and full lyric transcripts, we build a reproducible pipeline that (i) quantifies temporal trends in sonic attributes, (ii) models lexical salience and co-occurrence, and (iii) profiles mood by genre. We find a decade-long decline in energy (79 -> 58) alongside a rise in danceability (59 -> 73); valence peaks in 2013 (63) and dips in 2014-2016 (42) before partially recovering. Correlation analysis shows strong coupling of energy with loudness (r = 0.74) and negative associations for acousticness with both energy (r = -0.54) and loudness (r = -0.51); danceability is largely orthogonal to other features (|r| < 0.20). Lyric tokenization (>114k tokens) reveals a pronoun-centric lexicon "I/you/me/my" and a dense co-occurrence structure in which interpersonal address anchors mainstream narratives. Mood differs systematically by style: R&B exhibits the highest mean valence (96), followed by K-Pop/Pop (77) and Indie/Pop (70), whereas Latin/Reggaeton is lower (37) despite high danceability. Read through a subcultural identity lens, these patterns suggest the mainstreaming of previously peripheral codes and a commercial preference for relaxed yet rhythmically engaging productions that sustain collective participation without maximal intensity. Methodologically, we contribute an integrated MIR-plus-network workflow spanning summary statistics, correlation structure, lexical co-occurrence matrices, and genre-wise mood profiling that is robust to modality sparsity and suitable for socially aware recommendation or community-level diffusion studies.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes