AS · LG · SD · ML · Sep 4, 2019

Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification

arXiv:1909.02869v1 · 17 citations
Originality: Incremental advance
AI Analysis

This paper addresses device variability in machine listening applications and offers a cost-effective approach to domain adaptation, though it appears incremental because it builds on existing CNN-based methods.

The paper tackles distribution mismatch between training and application data in acoustic scene classification. It proposes a domain adaptation method that enforces device invariance using parallel audio recordings, yielding domain-invariant classifiers; the adaptation step itself needs no classification labels.

Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning. We study this problem in the context of machine listening (Task 1b of the DCASE 2019 Challenge). We propose a novel approach to learn domain-invariant classifiers in an end-to-end fashion by enforcing equal hidden layer representations for domain-parallel samples, i.e. time-aligned recordings from different recording devices. No classification labels are needed for our domain adaptation (DA) method, which makes the data collection process cheaper.
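The idea of "enforcing equal hidden layer representations for domain-parallel samples" can be illustrated with a small sketch. The abstract does not state the exact penalty, so the mean squared difference used here, the `weight` parameter, and all function names are assumptions made for illustration, not the authors' implementation. The point it shows is that the invariance term uses only pairs of time-aligned recordings and never touches class labels.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy over a labelled batch (numerically stable)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def parallel_invariance_penalty(hidden_a, hidden_b):
    """Assumed penalty: mean squared difference between the hidden
    representations of time-aligned recordings from two devices.
    No class labels are involved."""
    return np.mean((hidden_a - hidden_b) ** 2)


def combined_loss(logits, labels, hidden_a, hidden_b, weight=1.0):
    """Classification loss on labelled data plus the weighted
    invariance penalty on parallel pairs (weight is hypothetical)."""
    return (softmax_cross_entropy(logits, labels)
            + weight * parallel_invariance_penalty(hidden_a, hidden_b))
```

In a setup like this, the labelled batch and the parallel batch could come from different data: the classification term needs labels but no pairing, while the penalty needs pairing but no labels, which is what would make collecting the adaptation data cheaper.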

Code Implementations: 1 repo
