SD · LG · AS · Apr 17, 2021

Uncovering audio patterns in music with Nonnegative Tucker Decomposition for structural segmentation

arXiv:2104.08580v1 · 16 citations
Originality: Incremental advance
AI Analysis

This work addresses music analysis for pop songs, offering a method that could compete with example-based learning schemes, but it appears incremental as it builds on prior tensor decomposition techniques.

The authors tackle the problem of uncovering musical patterns and structure in pop songs from audio using Nonnegative Tucker Decomposition (NTD). The resulting features enable structural segmentation, and the experimental results on the RWC Pop dataset potentially challenge state-of-the-art approaches.

Recent work has proposed the use of tensor decomposition to model repetitions and to separate tracks in loop-based electronic music. The present work further investigates the ability of Nonnegative Tucker Decomposition (NTD) to uncover musical patterns and structure in pop songs in their audio form. Exploiting the fact that NTD tends to express the content of bars as linear combinations of a few patterns, we illustrate the ability of the decomposition to capture and single out repeated motifs in the corresponding compressed space, which can be interpreted from a musical viewpoint. The resulting features also turn out to be efficient for structural segmentation, leading to experimental results on the RWC Pop data set that potentially challenge state-of-the-art approaches relying on extensive example-based learning schemes.
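
The claim that NTD expresses each bar as a linear combination of a few patterns can be illustrated with a small NumPy sketch. It does not fit a decomposition and is not the authors' implementation: it builds a tensor from random nonnegative Tucker factors and checks that every bar slice is a weighted sum of a handful of pattern matrices. The tensor layout (frequency x time-in-bar x bar), the names `W`, `H`, `Q`, `G`, the sizes, and the cosine-similarity feature at the end are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: F frequency bins, T time frames per bar, B bars.
F, T, B = 12, 8, 6
# Hypothetical core dimensions (ranks) along each mode.
rf, rt, rp = 4, 3, 2

# Random nonnegative factors stand in for the output of a fitted NTD.
W = rng.random((F, rf))        # frequency templates
H = rng.random((T, rt))        # within-bar time templates
Q = rng.random((B, rp))        # one row of pattern weights per bar
G = rng.random((rf, rt, rp))   # core tensor linking the three factors

# Tucker model: X[f, t, b] = sum_{i,j,k} G[i, j, k] * W[f, i] * H[t, j] * Q[b, k]
X = np.einsum("ijk,fi,tj,bk->ftb", G, W, H, Q)

# Each slice of the core along the third mode, expanded by W and H,
# gives one F x T "pattern"; there are rp of them.
patterns = np.einsum("ijk,fi,tj->kft", G, W, H)


def bar_from_patterns(b):
    """Rebuild bar b as the combination of patterns weighted by row b of Q."""
    return np.tensordot(Q[b], patterns, axes=1)


# Assumed feature for segmentation: cosine similarity between rows of Q,
# so bars that use the same patterns in the same proportions score near 1.
Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)
autosim = Qn @ Qn.T
```

Under these assumptions, `X[:, :, b]` equals `bar_from_patterns(b)` for every bar, which is the "compressed space" view: a bar is fully described by its `rp` weights in `Q`. A real pipeline would estimate the factors from a spectrogram tensor with a nonnegative Tucker solver rather than draw them at random.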

Code Implementations: 1 repo
