ML · LG · SD · Sep 30, 2016

Optimal spectral transportation with application to music transcription

arXiv:1609.09799v2 · 41 citations
AI Analysis

This work improves music transcription systems by making them more robust to variations in sound, which is incremental but addresses a specific bottleneck in audio processing.

The paper addresses spectral unmixing in music transcription, where typical measures of fit are sensitive to small energy displacements and timbre variations. It proposes a holistic measure of fit based on optimal transportation that is invariant to harmonic shifts and local displacements. This allows a simplified dictionary and a fast algorithm that achieves state-of-the-art performance on real musical data.

Many spectral unmixing methods rely on the non-negative decomposition of spectral data onto a dictionary of spectral templates. In particular, state-of-the-art music transcription systems decompose the spectrogram of the input signal onto a dictionary of representative note spectra. The typical measures of fit used to quantify the adequacy of the decomposition compare the data and template entries frequency-wise. As such, small displacements of energy from one frequency bin to another, as well as variations of timbre, can disproportionately harm the fit. We address these issues by means of optimal transportation and propose a new measure of fit that treats the frequency distributions of energy holistically as opposed to frequency-wise. Building on the harmonic nature of sound, the new measure is invariant to shifts of energy to harmonically related frequencies, as well as to small and local displacements of energy. Equipped with this new measure of fit, the dictionary of note templates can be considerably simplified to a set of Dirac vectors located at the target fundamental frequencies (musical pitch values). This in turn gives ground to a very fast and simple decomposition algorithm that achieves state-of-the-art performance on real musical data.
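To make the idea in the abstract concrete, here is a minimal sketch of how a transport-based decomposition onto Dirac templates could look. It is not the authors' implementation. The cost form (squared distance to the nearest harmonic of a pitch, plus a small penalty `eps` on non-fundamental harmonics), the function names, and the parameters `n_harmonics` and `eps` are all assumptions made for illustration. With Dirac templates and no regularization, the cheapest transport plan sends all the energy in each frequency bin to the pitch with the lowest cost, so the decomposition reduces to an argmin and a sum.

```python
import numpy as np

def harmonic_cost(freqs, pitches, n_harmonics=8, eps=1e-3):
    """Assumed harmonic-invariant cost, shape (n_freqs, n_pitches).

    C[i, k] = min over q of (freqs[i] - q * pitches[k])**2 + eps * (q != 1),
    so energy near any harmonic of a pitch is cheap to move to that pitch.
    """
    q = np.arange(1, n_harmonics + 1)                     # harmonic indices 1..Q
    targets = q[None, :, None] * pitches[None, None, :]   # (1, Q, K) harmonic frequencies
    sq_dist = (freqs[:, None, None] - targets) ** 2       # (F, Q, K)
    penalty = eps * (q != 1)[None, :, None]               # small cost for q > 1
    return (sq_dist + penalty).min(axis=1)                # minimize over harmonics

def transcribe_frame(spectrum, cost):
    """Move each bin's (normalized) energy to its cheapest pitch and sum per pitch."""
    v = spectrum / spectrum.sum()                         # treat the frame as a distribution
    cheapest = cost.argmin(axis=1)                        # best pitch for every frequency bin
    return np.bincount(cheapest, weights=v, minlength=cost.shape[1])

# Toy frame: a slightly detuned A3 (220 Hz) with energy at its first two partials.
freqs = np.arange(0.0, 2000.0, 1.0)                       # 1 Hz grid (illustrative)
pitches = np.array([220.0, 277.18, 329.63])               # A3, C#4, E4 fundamentals
spectrum = np.zeros_like(freqs)
spectrum[222] = 0.6                                       # fundamental, 2 Hz sharp
spectrum[441] = 0.4                                       # second partial, 1 Hz sharp

C = harmonic_cost(freqs, pitches)
h = transcribe_frame(spectrum, C)                         # activation per pitch
```

In this toy frame both peaks are assigned to the 220 Hz template even though neither lies exactly on a harmonic, which is the kind of robustness to small displacements the abstract describes. A frequency-wise comparison against a fixed template would penalize the same detuning.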

Code Implementations: 1 repo