Weakly Supervised Scalable Audio Content Analysis
This work addresses the problem of high annotation costs in audio content analysis for multimedia applications, presenting an incremental improvement over supervised methods.
The authors tackled audio event detection by proposing a weakly supervised learning framework that reduces annotation effort using web multimedia data, demonstrating feasibility with multiple instance learning algorithms and achieving competitive performance.
Audio event detection is an important task for content analysis of multimedia data. Most current work on audio event detection is driven by supervised learning approaches. We propose a weakly supervised learning framework that can make use of the tremendous amount of web multimedia data with significantly reduced annotation effort and expense. Specifically, we use several multiple instance learning algorithms to show that audio event detection through weak labels is feasible. We also propose a novel scalable multiple instance learning algorithm and show that it is competitive with other multiple instance learning algorithms for audio event detection tasks.
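To make the weak-label setting concrete, the following is a minimal sketch of the generic multiple instance formulation, not the scalable algorithm proposed in this work. It assumes each recording is a "bag" of segment-level feature vectors ("instances") with a single recording-level label, and that a bag is positive if at least one instance is positive. The function names, the linear scorer, the toy two-dimensional features, and the training hyperparameters are all illustrative assumptions.

```python
import numpy as np


def bag_score(instances, w, b):
    """Score a bag under the max-instance rule.

    Returns the highest linear instance score and the index of the
    instance that produced it (a rough localisation of the event).
    """
    scores = instances @ w + b
    top = int(np.argmax(scores))
    return float(scores[top]), top


def train_max_instance_mil(bags, bag_labels, epochs=200, lr=0.5):
    """Fit a linear scorer using only bag-level (weak) labels.

    Hypothetical helper: logistic loss is applied to the top-scoring
    instance of each bag, so no segment-level annotation is needed.
    """
    dim = bags[0].shape[1]
    w = np.zeros(dim)
    b = 0.0
    for _ in range(epochs):
        for instances, y in zip(bags, bag_labels):
            s, top = bag_score(instances, w, b)
            p = 1.0 / (1.0 + np.exp(-s))  # predicted bag probability
            grad = p - y                  # d(logistic loss) / d(score)
            w -= lr * grad * instances[top]
            b -= lr * grad
    return w, b


# Toy data (assumed): each row is one audio segment's feature vector.
event = [1.0, 1.0]
background = [0.0, 0.0]
bags = [
    np.array([event, background, background]),       # event in segment 0
    np.array([background, event, background]),       # event in segment 1
    np.array([background, background, background]),  # no event
    np.array([background, background, background]),  # no event
]
bag_labels = [1, 1, 0, 0]  # weak labels: one per recording

w, b = train_max_instance_mil(bags, bag_labels)
```

After training, calling `bag_score(bags[1], w, b)` returns both a recording-level score and the index of the segment most responsible for it, which is how a bag-level label can still yield a coarse temporal localisation of the event.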