IR · Feb 4, 2019

Enhanced Hierarchical Music Structure Annotations via Feature Level Similarity Fusion

arXiv:1902.01023v1 · 17 citations
Originality: Incremental advance
AI Analysis

This work aims to improve music structure analysis, a core task in music information retrieval. The advance appears incremental, since it builds on prior spectral clustering approaches.

The paper tackles the problem of automatically discovering hierarchies of repeated sections in musical audio. It proposes using similarity network fusion (SNF) to combine frame-level features into affinity matrices for spectral clustering. The resulting segmentations agree better with human annotators, as measured by the L-measure metric, than those of prior spectral clustering approaches that rely on task-specific pre-processing heuristics.

We describe a novel pipeline to automatically discover hierarchies of repeated sections in musical audio. The proposed method uses similarity network fusion (SNF) to combine different frame-level features into clean affinity matrices, which are then used as input to spectral clustering. While prior spectral clustering approaches to music structure analysis have pre-processed affinity matrices with heuristics specifically designed for this task, we show that the SNF approach directly yields segmentations which agree better with human annotators, as measured by the "L-measure" metric for hierarchical annotations. Furthermore, the SNF approach immediately supports arbitrarily many input features, allowing us to simultaneously discover structure encoded in timbral, harmonic, and rhythmic representations without any changes to the base algorithm.
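To make the fuse-then-cluster idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a Gaussian affinity with a mean-distance bandwidth, a simplified SNF update that re-normalises after every iteration, and a single two-way split from the second Laplacian eigenvector in place of the hierarchical segmentation the paper describes. All function names (`affinity`, `full_kernel`, `local_kernel`, `snf`, `fiedler_partition`) and the synthetic "timbre" and "harmony" features are hypothetical, chosen only for illustration.

```python
import numpy as np

def affinity(X):
    """Gaussian affinity between frames (rows of X). Bandwidth choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / d2.mean())

def full_kernel(W):
    """Off-diagonal entries of each row sum to 1/2; the diagonal is set to 1/2."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    P = W / (2.0 * W.sum(axis=1, keepdims=True))
    np.fill_diagonal(P, 0.5)
    return P

def local_kernel(W, k):
    """Keep each row's k largest off-diagonal affinities, then row-normalise."""
    S = W.copy()
    np.fill_diagonal(S, 0.0)
    drop = np.argsort(S, axis=1)[:, :-k]      # indices of all but the k largest
    np.put_along_axis(S, drop, 0.0, axis=1)
    return S / S.sum(axis=1, keepdims=True)

def snf(affinities, k=5, n_iter=10):
    """Simplified similarity network fusion; needs at least two affinity matrices."""
    P = [full_kernel(W) for W in affinities]
    S = [local_kernel(W, k) for W in affinities]
    for _ in range(n_iter):
        total = sum(P)
        P_next = []
        for v in range(len(P)):
            others = (total - P[v]) / (len(P) - 1)   # mean of the other views
            Pv = S[v] @ others @ S[v].T              # diffuse through local structure
            P_next.append(full_kernel((Pv + Pv.T) / 2.0))
        P = P_next
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2.0

def fiedler_partition(A):
    """Two-way split from the second eigenvector of the normalised Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                      # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Synthetic demo: 40 frames, two sections of 20 frames, two feature types.
rng = np.random.default_rng(0)
section = np.repeat([0, 1], 20)
timbre = section[:, None] * 3.0 + 0.3 * rng.standard_normal((40, 4))
harmony = section[:, None] * -2.0 + 0.3 * rng.standard_normal((40, 6))

fused = snf([affinity(timbre), affinity(harmony)], k=5, n_iter=10)
labels = fiedler_partition(fused)
print(labels)
```

Because `snf` takes a list of affinity matrices, adding a third feature type (for example a rhythmic representation) only means appending one more matrix to that list, which mirrors the abstract's point that the base algorithm does not change with the number of inputs.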

