SD Feb 10, 2021

Sound Event Detection Based on Curriculum Learning Considering Learning Difficulty of Events

Noriyuki Tonami, Keisuke Imoto, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita

arXiv:2102.05288v1
5.94 citations

Originality: Incremental advance

AI Analysis

This work addresses sound event detection for audio analysis applications, offering an incremental improvement by adapting curriculum learning to handle training difficulty differences.

The paper addresses a difficulty imbalance in sound event detection: events that are absent from an acoustic scene are easy to train, while events that are present are difficult to train. It proposes a curriculum learning approach that trains from easy-to-train to difficult-to-train events, improving the F-score by 10.09 percentage points over conventional binary cross-entropy-based detection.

In conventional sound event detection (SED) models, two types of events, namely those that are present in an acoustic scene and those that do not occur in it, are treated as the same type of event. Conventional SED methods therefore cannot effectively exploit the difference between the two. All time frames of sound events that do not occur in an acoustic scene are easily regarded as inactive; that is, these events are easy to train. The time frames of events that are present in a scene must be classified as either active or inactive; that is, these events are difficult to train. To take advantage of this difference in training difficulty, we apply curriculum learning to SED, where models are trained from easy-to-train to difficult-to-train events. To make use of curriculum learning, we propose a new objective function for SED in which training proceeds from easy-to-train to difficult-to-train events. Experimental results show that the proposed method improves the F-score by 10.09 percentage points compared with conventional binary cross-entropy-based SED.
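
The easy-to-difficult idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' objective function, which the abstract does not specify. It assumes a simple scheme in which events absent from a clip (easy) always receive full weight in a frame-wise binary cross-entropy loss, while events present in the clip (difficult) receive a weight that ramps linearly from 0 to 1 over training. The function name `curriculum_bce` and the parameter `ramp_epochs` are hypothetical.

```python
import math

def curriculum_bce(probs, targets, epoch, ramp_epochs=10, eps=1e-7):
    """Weighted frame-wise BCE for one clip (illustrative assumption only).

    probs, targets: lists of shape [frames][events]; targets hold 0 or 1.
    """
    n_frames = len(targets)
    n_events = len(targets[0])
    # Assumed schedule: difficult events are phased in linearly over ramp_epochs.
    ramp = min(1.0, epoch / ramp_epochs)
    total = 0.0
    for k in range(n_events):
        # An event is "difficult" if it is active in at least one frame of the clip.
        difficult = any(targets[t][k] == 1 for t in range(n_frames))
        weight = ramp if difficult else 1.0
        for t in range(n_frames):
            # Clamp predictions to avoid log(0).
            p = min(max(probs[t][k], eps), 1.0 - eps)
            y = targets[t][k]
            total += weight * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / (n_frames * n_events)
```

With this schedule, only the absent (easy) events contribute to the loss at epoch 0, and from epoch `ramp_epochs` onward the loss equals ordinary binary cross entropy averaged over all frames and events.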
