AS, LG, SD · Feb 23, 2024

Toward Fully Self-Supervised Multi-Pitch Estimation

arXiv:2402.15569v1 · 7 citations · h-index: 6
Originality: Highly original
AI Analysis

This work addresses the shortage of large-scale annotated datasets for multi-pitch estimation in music analysis. It offers a self-supervised alternative that is trained only on synthetic single-note audio samples yet generalizes to polyphonic music mixtures.

The paper tackles the problem of multi-pitch estimation in polyphonic music mixtures by introducing a fully self-supervised learning framework, achieving performance comparable to supervised models without using annotated data.

Multi-pitch estimation is a decades-long research problem involving the detection of pitch activity associated with concurrent musical events within multi-instrument mixtures. Supervised learning techniques have demonstrated solid performance on more narrow characterizations of the task, but suffer from limitations concerning the shortage of large-scale and diverse polyphonic music datasets with multi-pitch annotations. We present a suite of self-supervised learning objectives for multi-pitch estimation, which encourage the concentration of support around harmonics, invariance to timbral transformations, and equivariance to geometric transformations. These objectives are sufficient to train an entirely convolutional autoencoder to produce multi-pitch salience-grams directly, without any fine-tuning. Despite training exclusively on a collection of synthetic single-note audio samples, our fully self-supervised framework generalizes to polyphonic music mixtures, and achieves performance comparable to supervised models trained on conventional multi-pitch datasets.
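As a rough illustration of what the invariance and equivariance objectives in the abstract ask of a model, the NumPy sketch below scores a model on two consistency checks. It is not the paper's implementation. The function names, the mean-squared-error penalty, the per-bin gain curve standing in for a timbral transformation, the frequency shift standing in for a geometric transformation, and the `toy_model` are all assumptions made for this example.

```python
import numpy as np

def shift_freq(x, k):
    """Shift a (freq_bins, frames) array by k bins along frequency, zero-padding."""
    out = np.zeros_like(x)
    if k > 0:
        out[k:] = x[:-k]
    elif k < 0:
        out[:k] = x[-k:]
    else:
        out[:] = x
    return out

def equivariance_loss(model, x, k):
    """MSE between model(shift(x)) and shift(model(x)).

    Zero when shifting the input shifts the output by the same amount.
    """
    return float(np.mean((model(shift_freq(x, k)) - shift_freq(model(x), k)) ** 2))

def invariance_loss(model, x, gains):
    """MSE between the outputs for x and for x scaled by a per-bin gain curve.

    Zero when the output ignores the (hypothetical) timbral change.
    """
    return float(np.mean((model(gains[:, None] * x) - model(x)) ** 2))

def toy_model(x):
    """Stand-in for a trained network: peak-normalize every frame."""
    peak = x.max(axis=0, keepdims=True)
    return x / np.maximum(peak, 1e-8)

# Toy "spectrogram": 16 frequency bins, 4 frames, two active partials.
x = np.zeros((16, 4))
x[5] = 1.0
x[10] = 0.5

gains = np.ones(16)
gains[10] = 4.0  # boost one partial, as a crude change of timbre

print("equivariance loss:", equivariance_loss(toy_model, x, 3))
print("invariance loss:", invariance_loss(toy_model, x, gains))
```

In a training loop, terms like these would be minimized alongside the other objectives. Here the toy model passes the shift check but fails the gain check, because boosting one partial changes its peak-normalized output.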

Code Implementations: 1 repo