SD, LG, AS | Feb 5, 2022

SEED: Sound Event Early Detection via Evidential Uncertainty

arXiv:2202.02441v2 | 13 citations
AI Analysis

This work addresses unreliable early detection in sound event recognition. The contribution is incremental, but it offers specific performance gains for acoustic environment monitoring.

The paper tackles over-confidence in early-stage sound event detection. It proposes a Polyphonic Evidential Neural Network (PENet) that models uncertainty with Beta distributions and uses backtrack inference, reporting a 13.0% improvement in time delay and a 3.8% gain in detection F1 score over state-of-the-art methods.

Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, which suffers from over-confidence in early-stage event detection and usually yields unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) to model the evidential uncertainty of the class probability with a Beta distribution. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously improves time delay by 13.0% and detection F1 score by 3.8% compared to state-of-the-art methods.
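The abstract does not spell out how a Beta distribution yields an uncertainty estimate, so the following is only a minimal sketch of one common formulation (the binomial opinion from subjective logic), not the paper's implementation. It assumes a network emits non-negative "present" and "absent" evidence per class and per frame. The names `BetaOpinion`, `opinion_from_evidence`, `detect_early`, and both thresholds are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaOpinion:
    """Beta(alpha, beta) over the probability that one sound class is active."""
    alpha: float  # pseudo-count supporting "event present"
    beta: float   # pseudo-count supporting "event absent"

    @property
    def expected_prob(self) -> float:
        # Mean of a Beta distribution.
        return self.alpha / (self.alpha + self.beta)

    @property
    def vacuity(self) -> float:
        # Uncertainty due to lack of evidence; 2 is the number of outcomes
        # (present / absent). Equals 1.0 when no evidence has been seen.
        return 2.0 / (self.alpha + self.beta)


def opinion_from_evidence(pos_evidence: float, neg_evidence: float) -> BetaOpinion:
    """Assumed mapping: each Beta parameter is evidence plus a uniform prior of 1."""
    if pos_evidence < 0 or neg_evidence < 0:
        raise ValueError("evidence must be non-negative")
    return BetaOpinion(alpha=pos_evidence + 1.0, beta=neg_evidence + 1.0)


def detect_early(opinion: BetaOpinion,
                 prob_threshold: float = 0.5,
                 max_vacuity: float = 0.5) -> bool:
    """Hypothetical rule: report an event only if it is likely AND well supported."""
    return opinion.expected_prob > prob_threshold and opinion.vacuity <= max_vacuity


# Early frame: little evidence, so the estimate is likely but not yet trusted.
early = opinion_from_evidence(1.0, 0.0)
# Later frame: more evidence accumulated for the same class.
later = opinion_from_evidence(9.0, 1.0)

print(detect_early(early), detect_early(later))
```

Under these assumptions, the early frame has a high expected probability but also high vacuity, so the rule withholds a detection. This is the kind of over-confident early prediction the abstract describes a plain probability threshold as prone to.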

