SD LG AS Jun 21, 2021

Affinity Mixup for Weakly Supervised Sound Event Detection

Mohammad Rasool Izadi, Robert Stevenson, Laura N. Kloepper

arXiv:2106.11233v1 2.3

Originality: Incremental advance

AI Analysis

This paper addresses the problem of detecting sound events from weak labels, with applications such as audio analysis, though it appears incremental because it builds on existing attention and graph neural network concepts.

The paper tackles weakly supervised sound event detection by introducing affinity mixup, a regularization technique that incorporates time-level similarities between frames using an adaptive affinity matrix. This approach improves event-F1 scores by 8.2% over state-of-the-art methods.

Weakly supervised sound event detection is the task of predicting the presence of sound events and their corresponding starting and ending points using a weakly labeled dataset. A weakly labeled dataset associates each training sample (a short recording) with one or more sources present in it. Networks that rely solely on convolutional and recurrent layers cannot directly relate multiple frames in a recording. Motivated by attention and graph neural networks, we introduce the concept of an affinity mixup to incorporate time-level similarities and make a connection between frames. This regularization technique mixes up features in different layers using an adaptive affinity matrix. Our proposed affinity mixup network improves event-F1 scores over state-of-the-art techniques by $8.2\%$.
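
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of mixing frame features through an affinity matrix. Everything in it is an assumption made for illustration: the function names `affinity_matrix` and `affinity_mixup`, the mixing weight `lam`, and the choice of a softmax over scaled dot-product similarities as the affinity. In the paper the affinity matrix is described as adaptive, whereas here it is simply computed from the features themselves.

```python
import numpy as np


def affinity_matrix(features):
    """Hypothetical affinity: row-wise softmax of scaled dot-product similarity.

    features: array of shape (T, D), one D-dimensional feature per time frame.
    Returns a (T, T) matrix whose rows sum to 1.
    """
    _, dim = features.shape
    scores = features @ features.T / np.sqrt(dim)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def affinity_mixup(features, lam=0.5):
    """Blend each frame with an affinity-weighted average of all frames.

    lam is an assumed mixing coefficient in [0, 1]; lam=0 leaves the
    features unchanged, lam=1 replaces them with the weighted averages.
    """
    affinity = affinity_matrix(features)
    return (1.0 - lam) * features + lam * (affinity @ features)


# Toy input: 6 time frames with 4 features each.
rng = np.random.default_rng(0)
frames = rng.normal(size=(6, 4))
mixed = affinity_mixup(frames, lam=0.3)
```

Because each row of the affinity matrix sums to 1, every mixed frame is a convex combination of the original frames, which is what lets similar frames in a recording share information regardless of how far apart they are in time.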
