SD Dec 29, 2016

What Makes Audio Event Detection Harder than Classification?

Huy Phan, Philipp Koch, Fabrice Katzberg, Marco Maass, Radoslaw Mazur, Ian McLoughlin, Alfred Mertins

arXiv:1612.09089v48.710 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving audio event detection accuracy for applications like surveillance or multimedia analysis, though it is incremental as it builds on existing detection and classification methods.

The paper analyzes why audio event detection is harder than classification and proposes a verification step using a high-quality classifier to reduce false alarms in detection systems. Experiments on the ITC-Irst dataset show significant and consistent performance improvements across various detector-classifier combinations.

It is commonly observed that audio event classification is easier than detection. So far, this observation has been accepted as fact without careful analysis. In this paper, we examine the reasons behind it and, more importantly, leverage them to benefit the audio event detection task. We present an improved detection pipeline in which a verification step is appended to a detection system. This step employs a high-quality event classifier to postprocess the benign event hypotheses output by the detection system and reject false alarms. To demonstrate the effectiveness of the proposed pipeline, we implement and pair up different event detectors based on the most common detection schemes with various event classifiers, ranging from the standard bag-of-words model to the state-of-the-art bank-of-regressors one. Experimental results on the ITC-Irst dataset show significant improvements in detection performance. More importantly, these improvements are consistent across all detector-classifier combinations.
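The verification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Hypothesis` type, the classifier interface, the `min_confidence` threshold, and the acceptance rule (keep a hypothesis only when the classifier agrees with the detector's label) are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Hypothesis:
    """One event hypothesis produced by a detector (hypothetical structure)."""
    onset: float   # start time in seconds
    offset: float  # end time in seconds
    label: str     # event class proposed by the detector


# Hypothetical classifier interface: given a segment's onset and offset,
# return the predicted label and a confidence in [0, 1].
Classifier = Callable[[float, float], Tuple[str, float]]


def verify(
    hypotheses: Sequence[Hypothesis],
    classify: Classifier,
    min_confidence: float = 0.5,  # assumed threshold, not from the paper
) -> List[Hypothesis]:
    """Keep only hypotheses that the classifier confirms.

    Assumed rule: a hypothesis survives when the classifier predicts the
    same label with at least `min_confidence`; everything else is treated
    as a false alarm and rejected.
    """
    kept = []
    for hyp in hypotheses:
        label, confidence = classify(hyp.onset, hyp.offset)
        if label == hyp.label and confidence >= min_confidence:
            kept.append(hyp)
    return kept


def toy_classifier(onset: float, offset: float) -> Tuple[str, float]:
    """Stand-in classifier: very short segments are labeled as background."""
    if offset - onset >= 0.1:
        return ("door_slam", 0.9)
    return ("background", 0.8)


detections = [
    Hypothesis(1.0, 1.5, "door_slam"),   # plausible event
    Hypothesis(2.0, 2.05, "door_slam"),  # too short, rejected by the toy classifier
]
verified = verify(detections, toy_classifier)
```

In this toy run, only the first hypothesis survives verification. Because the verifier sees the detector only through its list of hypotheses, any detector can be paired with any classifier, which mirrors the detector-classifier combinations the abstract mentions.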
