SD, AI, LG, MM, AS | Oct 15, 2023

MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning

arXiv:2310.09853v1 | 8 citations | h-index: 42
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem in music information retrieval for musicians and audio engineers, but it is incremental as it builds on existing self-supervised learning methods.

The paper tackles automatic instrument playing technique detection, which suffers from limited labeled data and class imbalance. It finetunes a self-supervised pretrained model with pitch and onset detection as auxiliary tasks, and reports state-of-the-art performance on multiple benchmark datasets.

Instrument playing techniques (IPTs) constitute a pivotal component of musical expression. However, the development of automatic IPT detection methods suffers from limited labeled data and inherent class imbalance issues. In this paper, we propose to apply a self-supervised learning model pre-trained on large-scale unlabeled music data and finetune it on IPT detection tasks. This approach addresses data scarcity and class imbalance challenges. Recognizing the significance of pitch in capturing the nuances of IPTs and the importance of onset in locating IPT events, we investigate multi-task finetuning with pitch and onset detection as auxiliary tasks. Additionally, we apply a post-processing approach for event-level prediction, where an IPT activation initiates an event only if the onset output confirms an onset in that frame. Our method outperforms prior approaches in both frame-level and event-level metrics across multiple IPT benchmark datasets. Further experiments demonstrate the efficacy of multi-task finetuning on each IPT class.
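
The event-level post-processing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `frames_to_events`, the 0.5 thresholds, and the rule that an event ends when the IPT activation drops below threshold are all assumptions made for illustration. Only the gating rule (an event may start only in a frame where the onset output also fires) comes from the abstract.

```python
from typing import List, Sequence, Tuple


def frames_to_events(
    ipt_probs: Sequence[float],
    onset_probs: Sequence[float],
    ipt_threshold: float = 0.5,    # assumed value, not from the paper
    onset_threshold: float = 0.5,  # assumed value, not from the paper
) -> List[Tuple[int, int]]:
    """Convert frame-level outputs for one IPT class into events.

    Returns (start_frame, end_frame) pairs with an exclusive end frame.
    An event starts only in a frame where the IPT activation and the
    onset output are both above threshold. It ends at the first later
    frame where the IPT activation falls below threshold (assumption).
    """
    if len(ipt_probs) != len(onset_probs):
        raise ValueError("ipt_probs and onset_probs must have equal length")

    events: List[Tuple[int, int]] = []
    start = None  # start frame of the currently open event, if any
    for t, (p_ipt, p_onset) in enumerate(zip(ipt_probs, onset_probs)):
        active = p_ipt >= ipt_threshold
        onset = p_onset >= onset_threshold
        if start is None:
            # Gate: an activation without a confirmed onset opens nothing.
            if active and onset:
                start = t
        elif not active:
            events.append((start, t))
            start = None
    if start is not None:
        # Close an event that is still open at the end of the clip.
        events.append((start, len(ipt_probs)))
    return events


ipt = [0.1, 0.9, 0.8, 0.7, 0.2, 0.9, 0.9, 0.1]
onset = [0.0, 0.8, 0.1, 0.1, 0.0, 0.2, 0.1, 0.0]
print(frames_to_events(ipt, onset))  # [(1, 4)]
```

In this toy input, the second run of high IPT activation (frames 5 and 6) produces no event because no onset is confirmed in the frame where it begins.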

Code Implementations: 1 repo
