SDAS, Jan 24, 2020

Learning Multi-instrument Classification with Partial Labels

arXiv:2001.08864v1, 5 citations
AI Analysis

This work addresses accurate instrument classification for audio analysis applications, but it is incremental: it builds on existing methods for weakly labeled data.

The paper tackles multi-instrument recognition in audio clips using deep learning, addresses the challenges posed by the weak and partial labels of the OpenMIC dataset, and reports state-of-the-art results.

Multi-instrument recognition is the task of predicting the presence or absence of different instruments within an audio clip. A considerable challenge in applying deep learning to multi-instrument recognition is the scarcity of labeled data. OpenMIC is a recent dataset containing 20K polyphonic audio clips. The dataset is weakly labeled, in that only the presence or absence of instruments is known for each clip, while the onset and offset times are unknown. The dataset is also partially labeled, in that only a subset of instruments are labeled for each clip. In this work, we investigate the use of attention-based recurrent neural networks to address the weakly-labeled problem. We also use different data augmentation methods to mitigate the partially-labeled problem. Our experiments show that our approach achieves state-of-the-art results on the OpenMIC multi-instrument recognition task.
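The two labeling problems described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and the abstract does not give architectural or loss details. The function names `attention_pool` and `masked_bce` are hypothetical. Attention pooling stands in for the attention mechanism: it turns per-frame predictions into one clip-level prediction without needing onset or offset times. The masked loss is a common way to handle partial labels and is an assumption here; the paper itself states that it uses data augmentation for the partially-labeled problem.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_pool(frame_logits, attn_scores):
    """Combine per-frame predictions for one instrument into a clip-level probability.

    frame_logits: per-frame presence logits (length T).
    attn_scores:  per-frame unnormalized attention scores (length T).
    Both would come from a network; here they are plain lists (assumption).
    """
    # Softmax over time, shifted by the max score for numerical stability.
    top = max(attn_scores)
    exps = [math.exp(s - top) for s in attn_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average of per-frame probabilities.
    return sum(w * sigmoid(z) for w, z in zip(weights, frame_logits))


def masked_bce(probs, labels, mask, eps=1e-7):
    """Binary cross-entropy averaged over labeled instruments only.

    mask[i] is True when instrument i has a label for this clip;
    unlabeled instruments contribute nothing to the loss.
    """
    loss_sum, count = 0.0, 0
    for p, y, labeled in zip(probs, labels, mask):
        if not labeled:
            continue
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        loss_sum += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return loss_sum / count if count else 0.0


# Toy clip: three frames, the instrument is clearly active in the middle frame.
clip_prob = attention_pool([-2.0, 3.0, -1.0], [0.0, 2.0, 0.0])
# Two instruments, only the first one is labeled for this clip.
loss = masked_bce([clip_prob, 0.5], [1, 0], [True, False])
```

With attention concentrated on the middle frame, the clip-level probability is pulled toward that frame's prediction, and the second instrument's missing label leaves the loss unchanged.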

