SD LG AS
Feb 5, 2022

SEED: Sound Event Early Detection via Evidential Uncertainty

Xujiang Zhao, Xuchao Zhang, Wei Cheng, Wenchao Yu, Yuncong Chen, Haifeng Chen, Feng Chen

arXiv:2202.02441v28.313 citations

Originality: Incremental advance

AI Analysis

This work addresses unreliable early-stage predictions in sound event detection. The advance is incremental, but it reports specific performance gains relevant to acoustic environment monitoring.

The paper tackles the problem of over-confidence in early-stage sound event detection by proposing a Polyphonic Evidential Neural Network (PENet) that models uncertainty with Beta distributions and uses backtrack inference, resulting in a 13.0% improvement in time delay and a 3.8% gain in detection F1 score compared to state-of-the-art methods.

Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection; applied to early-stage detection, they suffer from over-confidence and usually yield unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method improves time delay by 13.0% and detection F1 score by 3.8% compared to state-of-the-art methods.
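
To make the idea of Beta-based evidential uncertainty concrete, the sketch below maps positive and negative evidence for a single sound class to a Beta distribution and reads off belief, disbelief, and vacuity (uncertainty due to lack of evidence). It assumes the standard subjective-logic mapping for binomial opinions with a uniform prior; the abstract does not spell out PENet's exact formulation, so the names `BetaOpinion`, `beta_opinion`, and `prior_weight` are hypothetical and this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class BetaOpinion:
    belief: float         # mass supporting "event is present"
    disbelief: float      # mass supporting "event is absent"
    vacuity: float        # uncertainty caused by lack of evidence
    expected_prob: float  # mean of the Beta distribution


def beta_opinion(pos_evidence: float, neg_evidence: float,
                 prior_weight: float = 2.0) -> BetaOpinion:
    """Map evidence to a Beta distribution and a subjective-logic opinion.

    Assumption (not from the abstract): alpha = r + W/2 and beta = s + W/2,
    where r and s are non-negative evidence and W is the prior weight.
    """
    if pos_evidence < 0 or neg_evidence < 0:
        raise ValueError("evidence must be non-negative")
    alpha = pos_evidence + prior_weight / 2.0
    beta = neg_evidence + prior_weight / 2.0
    total = alpha + beta  # equals r + s + W
    return BetaOpinion(
        belief=pos_evidence / total,
        disbelief=neg_evidence / total,
        vacuity=prior_weight / total,
        expected_prob=alpha / total,
    )


if __name__ == "__main__":
    # Early in an event little evidence has accumulated; later there is more.
    early = beta_opinion(pos_evidence=1.0, neg_evidence=0.0)
    late = beta_opinion(pos_evidence=18.0, neg_evidence=0.0)
    print("early:", early)
    print("late: ", late)
```

Under these assumptions, one unit of positive evidence gives an expected probability of 2/3 but a vacuity of 2/3, while 18 units give an expected probability of 0.95 with a vacuity of 0.1. A detector that looks only at the expected probability could fire on the early frame, whereas the vacuity shows that the prediction rests on very little evidence. In a polyphonic setting, one such opinion would be computed independently for each sound class.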
