SD CV AS May 19, 2021

Unsupervised Discriminative Learning of Sounds for Audio Event Classification

Sascha Hornauer, Ke Li, Stella X. Yu, Shabnam Ghaffarzadegan, Liu Ren

arXiv:2105.09279v22.3

Originality: Incremental advance

AI Analysis

The method offers a faster alternative to ImageNet pre-training for audio event classification, though the advance is incremental because it builds on existing pre-training approaches.

The paper tackles the problem of time-consuming pre-training on visual data for audio event classification by proposing an unsupervised discriminative learning method that uses only audio data, achieving on-par performance with ImageNet pre-training on several benchmarks.

Recent progress in network-based audio event classification has shown the benefit of pre-training models on visual data such as ImageNet. While this process allows knowledge transfer across different domains, training a model on large-scale visual datasets is time-consuming. On several audio event classification benchmarks, we show a fast and effective alternative that pre-trains the model unsupervised, only on audio data, and yet delivers on-par performance with ImageNet pre-training. Furthermore, we show that our discriminative audio learning can be used to transfer knowledge across audio datasets and can optionally include ImageNet pre-training.
