AS · LG · SD · Oct 30, 2019

Metric Learning with Background Noise Class for Few-shot Detection of Rare Sound Events

arXiv:1910.13724v2 · 27 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reducing false positives in sound event detection for applications like audio monitoring, though it is an incremental advance that extends metric learning with a background noise class.

The paper tackles the problem of few-shot detection of rare sound events in audio sequences containing background noise and other events, achieving performance comparable to a baseline system that requires extensive annotated data.

Few-shot learning systems for sound event recognition have gained interest because they require only a few examples to adapt to new target classes without fine-tuning. However, such systems have only been applied to chunks of sound for classification or verification. In this paper, we aim to achieve few-shot detection of rare sound events from query sequences that contain not only the target events but also other events and background noise. The system must therefore avoid false positive reactions to both the other events and the background noise. We propose metric learning with a background noise class for few-shot detection. The contribution is the explicit inclusion of background noise as an independent class, a suitable loss function that emphasizes this additional class, and a corresponding sampling strategy that assists training. The method provides a feature space in which the event classes and the background noise class are sufficiently separated. Evaluations on few-shot detection tasks, using DCASE 2017 task2 and ESC-50, show that our proposed method outperforms metric learning that does not consider the background noise class. The few-shot detection performance is also comparable to that of the DCASE 2017 task2 baseline system, which requires a huge amount of annotated audio data.
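The abstract does not give the loss function itself, so the following is only a minimal sketch of the general idea, assuming a prototypical-network style setup: class prototypes are mean embeddings, background noise is one more class, and a per-class weight lets the background class count more in the loss. The function names (`prototypes`, `weighted_proto_loss`, `nearest_class`), the squared Euclidean distance, and the weight values are all hypothetical choices for illustration, not the authors' formulation.

```python
import numpy as np

def prototypes(support, labels, n_classes):
    """Mean embedding of the support examples of each class.

    One of the class indices is assumed to be reserved for background noise.
    """
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def weighted_proto_loss(query, query_labels, protos, class_weights):
    """Class-weighted cross-entropy over negative squared distances.

    class_weights is a hypothetical knob: giving the background class a
    larger weight makes errors on background queries cost more.
    """
    # Squared Euclidean distance from every query to every prototype.
    d2 = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(query)), query_labels]
    w = class_weights[query_labels]
    return float((w * nll).sum() / w.sum())

def nearest_class(query, protos):
    """Index of the closest prototype for each query embedding."""
    d2 = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)
```

At detection time, under the same assumptions, a query frame whose nearest prototype is the background class would simply be rejected instead of being forced onto one of the event classes, which is how a separate background class can reduce false positives.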

