AS LG SD ML Sep 4, 2019

Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification

Paul Primus, Hamid Eghbal-zadeh, David Eitelsebner, Khaled Koutini, Andreas Arzt, Gerhard Widmer

arXiv:1909.02869v110.317 citations Has Code

Originality: Incremental advance

AI Analysis

The paper addresses device variability in machine listening applications and offers a cost-effective approach to domain adaptation. It appears incremental because it builds on existing CNN-based methods.

The paper tackles distribution mismatch between training and application data in acoustic scene classification. It proposes a domain adaptation method that enforces device invariance using parallel audio recordings, i.e. time-aligned recordings from different recording devices, so that the classifier produces matching hidden-layer representations for them. The adaptation step itself requires no classification labels.

Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning. We study this problem in the context of machine listening (Task 1b of the DCASE 2019 Challenge). We propose a novel approach to learn domain-invariant classifiers in an end-to-end fashion by enforcing equal hidden layer representations for domain-parallel samples, i.e. time-aligned recordings from different recording devices. No classification labels are needed for our domain adaptation (DA) method, which makes the data collection process cheaper.
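The idea of enforcing equal hidden layer representations for domain-parallel samples can be sketched as an extra penalty added to the usual classification loss. The code below is a minimal illustration, not the authors' implementation: the abstract does not say which distance measure or weighting is used, so the mean squared difference, the `weight` parameter, and all function names here are assumptions made for the example.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stabilised)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def invariance_penalty(hidden_a, hidden_b):
    """Mean squared difference between hidden activations of parallel pairs.

    hidden_a[i] and hidden_b[i] are the hidden-layer activations for the
    same time-aligned excerpt recorded by two different devices.
    The squared-error distance is an assumption of this sketch.
    """
    total, count = 0.0, 0
    for h_a, h_b in zip(hidden_a, hidden_b):
        for x, y in zip(h_a, h_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def combined_loss(logits, labels, hidden_a, hidden_b, weight=1.0):
    """Classification loss on labelled samples plus the invariance penalty.

    `weight` is a hypothetical trade-off parameter, not taken from the paper.
    """
    ce = sum(cross_entropy(z, y) for z, y in zip(logits, labels)) / len(labels)
    return ce + weight * invariance_penalty(hidden_a, hidden_b)


# Toy example: one labelled sample and one parallel pair of activations.
loss = combined_loss(
    logits=[[0.0, 0.0]],
    labels=[0],
    hidden_a=[[0.0, 0.0]],
    hidden_b=[[1.0, 3.0]],
    weight=0.1,
)
```

Note that `invariance_penalty` only takes the paired activations as input, never a class label, which mirrors the abstract's point that parallel recordings can be collected without annotation.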

