SD AS · Dec 4, 2018

Learning to match transient sound events using attentional similarity for few-shot sound recognition

arXiv:1812.01269v2 · 65 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of quickly adapting classifiers to new sound events from limited data. The advance is incremental: it enhances existing metric-based learning methods rather than replacing them.

The paper tackles few-shot sound recognition by introducing an attentional similarity module that improves the matching of transient sound events, achieving relative improvements in 5-shot 5-way accuracy of up to +7.7% on the ESC-50 dataset.

In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition. Given a few examples of an unseen sound event, a classifier must be quickly adapted to recognize the new sound event without much fine-tuning. The proposed attentional similarity module can be plugged into any metric-based learning method for few-shot learning, allowing the resulting model to especially match related short sound events. Extensive experiments on two datasets show that the proposed module consistently improves the performance of five different metric-based learning methods for few-shot sound recognition. The relative improvement ranges from +4.1% to +7.7% for 5-shot 5-way accuracy for the ESC-50 dataset, and from +2.1% to +6.5% for noiseESC-50. Qualitative results demonstrate that our method contributes in particular to the recognition of transient sound events.
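The abstract does not spell out how the module computes similarity, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each clip is a sequence of frame-level feature vectors, that a scoring vector `w` (hypothetical, standing in for learned attention parameters) assigns a weight to each frame, and that two clips are compared by the cosine similarity of their attention-pooled features. The function names `attention_pool`, `attentional_similarity`, and `classify` are illustrative.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_pool(frames, w):
    """Pool a (T, D) clip into a (D,) vector using attention over time.

    `w` is a hypothetical (D,) scoring vector; in a real model it would be
    learned jointly with the feature extractor.
    """
    weights = softmax(frames @ w)      # (T,), non-negative, sums to 1
    return weights @ frames, weights   # weighted average of the frames


def attentional_similarity(clip_a, clip_b, w):
    """Cosine similarity between the attention-pooled features of two clips."""
    pa, _ = attention_pool(clip_a, w)
    pb, _ = attention_pool(clip_b, w)
    return float(pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb) + 1e-8))


def classify(query, support, w):
    """Assign the query to the class whose support clips are most similar.

    `support` maps each class label to a list of (T, D) arrays (the shots).
    """
    scores = {
        label: float(np.mean([attentional_similarity(query, s, w) for s in clips]))
        for label, clips in support.items()
    }
    return max(scores, key=scores.get), scores
```

Because the attention weights concentrate on high-scoring frames, a short event occupying a single frame can dominate the pooled vector even when most of the clip is background. Plain average pooling would instead dilute that frame by the clip length, which is one plausible reason an attention-based similarity helps with transient sounds.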

Code Implementations: 1 repo