SDAIAS · May 22, 2025

Layer-wise Investigation of Large-Scale Self-Supervised Music Representation Models

arXiv:2505.16306v1 · 3 citations · h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses the limited understanding of the information encoded in pre-trained music models. It is an incremental contribution aimed at researchers in music information retrieval.

The study analyzes self-supervised learning models for music information retrieval. It validates their advantages across downstream tasks and examines how individual layers specialize, offering insights into model structure and applications.

Pre-trained models for music information retrieval based on self-supervised learning (SSL) have recently become popular and have shown success in various downstream tasks. However, there is limited research on what the encoded information specifically represents and how applicable it is. Exploring these aspects can help us better understand the capabilities and limitations of these models, leading to more effective use in downstream tasks. In this study, we analyze the advanced music representation model MusicFM and the newly emerged SSL model MuQ. We focus on three main aspects: (i) validating the advantages of SSL models across multiple downstream tasks, (ii) exploring the specialization of layer-wise information for different tasks, and (iii) comparing performance differences when selecting specific layers. Through this analysis, we reveal insights into the structure and potential applications of SSL models in music information retrieval.
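
To make aspects (ii) and (iii) concrete, the following is a minimal sketch of a layer-wise probing loop: one small classifier is fitted per layer on frozen features, and the per-layer scores are compared. Everything in it is an assumption made for illustration. The function names are hypothetical, the features are synthetic rather than taken from MusicFM or MuQ, and a nearest-centroid classifier stands in for whatever probe the authors actually train, which this summary does not specify.

```python
import random


def centroid_probe_accuracy(train_x, train_y, test_x, test_y):
    """Fit one centroid per class on the training split, then score
    nearest-centroid predictions on the held-out split."""
    sums, counts = {}, {}
    for vec, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(vec):
        # Pick the class whose centroid has the smallest squared distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])),
        )

    correct = sum(predict(vec) == label for vec, label in zip(test_x, test_y))
    return correct / len(test_y)


def layerwise_probe(features_by_layer, labels, train_fraction=0.8):
    """Return one held-out accuracy per layer.

    features_by_layer[l][n] is the (frozen) embedding of sample n at layer l.
    """
    n_train = int(len(labels) * train_fraction)
    return [
        centroid_probe_accuracy(
            layer[:n_train], labels[:n_train], layer[n_train:], labels[n_train:]
        )
        for layer in features_by_layer
    ]


def make_synthetic_features(n_layers=4, n_samples=300, dim=8, n_classes=3,
                            informative_layer=2, seed=0):
    """Synthetic stand-in for model activations: only one layer carries
    label information, the others are pure noise."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_classes) for _ in range(n_samples)]
    class_means = [[rng.gauss(0.0, 3.0) for _ in range(dim)]
                   for _ in range(n_classes)]
    features = []
    for layer in range(n_layers):
        rows = []
        for label in labels:
            row = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            if layer == informative_layer:
                row = [x + m for x, m in zip(row, class_means[label])]
            rows.append(row)
        features.append(rows)
    return features, labels


if __name__ == "__main__":
    features, labels = make_synthetic_features()
    scores = layerwise_probe(features, labels)
    for index, score in enumerate(scores):
        print(f"layer {index}: accuracy {score:.2f}")
    print("best layer:", scores.index(max(scores)))
```

With real models, the synthetic generator would be replaced by hidden states extracted from each layer for a labelled downstream dataset. Selecting the best-scoring layer per task then corresponds to the layer-selection comparison in aspect (iii).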

