ASSD | Oct 26, 2021

Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

arXiv:2110.13586v1 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation for acoustic scene classification, an incremental advance for machine listening applications affected by microphone variations.

The paper tackles domain shift in acoustic scene classification by proposing a domain adaptation strategy that uses disentanglement learning to separate task-specific and domain-specific characteristics. It reports only minor performance improvements when training data from both domains is available.

The deployment of machine listening algorithms in real-life applications is often impeded by a domain shift caused, for instance, by different microphone characteristics. In this paper, we propose a novel domain adaptation strategy based on disentanglement learning. The goal is to disentangle task-specific and domain-specific characteristics in the analyzed audio recordings. In particular, we combine two strategies: first, we apply different binary masks to internal embedding representations and, second, we suggest a novel combination of categorical cross-entropy and variance-based losses. Our results confirm the disentanglement of both tasks on an embedding level but show only minor improvement in the acoustic scene classification performance when training data from both domains can be used. As a second finding, we can confirm the effectiveness of a state-of-the-art unsupervised domain adaptation strategy, which instead performs across-domain adaptation on the feature level.
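
The two ingredients named in the abstract, binary masks on an embedding and a loss that combines categorical cross-entropy with a variance-based term, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the mask layout, the exact variance loss, or the loss weighting, so the complementary mask, the `domain_variance` penalty, the linear classifier `W`, and the weight `0.1` are all assumptions chosen only for illustration.

```python
import numpy as np

def split_embedding(z, task_mask):
    """Apply complementary binary masks to a batch of embeddings z of shape (N, D).

    Assumption: one mask selects task dimensions, its complement selects
    domain dimensions. The paper's actual mask layout may differ.
    """
    task_mask = np.asarray(task_mask, dtype=z.dtype)
    return z * task_mask, z * (1.0 - task_mask)

def cross_entropy(logits, labels):
    """Mean categorical cross-entropy from raw logits and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def domain_variance(z_part, domains):
    """Hypothetical variance-based penalty: variance of the per-domain mean
    embeddings, averaged over dimensions. It is zero when all domains share
    the same mean embedding."""
    means = np.stack([z_part[domains == d].mean(axis=0) for d in np.unique(domains)])
    return means.var(axis=0).mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 6))                      # toy embeddings
task_mask = np.array([1, 1, 1, 0, 0, 0])         # assumed split: 3 task dims, 3 domain dims
z_task, z_domain = split_embedding(z, task_mask)

W = rng.normal(size=(6, 3))                      # hypothetical linear scene classifier
labels = rng.integers(0, 3, size=8)              # toy scene labels
domains = np.array([0, 0, 0, 0, 1, 1, 1, 1])     # toy domain (microphone) labels

# Combined objective: classify scenes from the task part while discouraging
# domain-dependent shifts in that part. The weight 0.1 is arbitrary.
loss = cross_entropy(z_task @ W, labels) + 0.1 * domain_variance(z_task, domains)
```

In this sketch the task part of the embedding feeds the scene classifier, and the variance term pushes the per-domain means of that part together. Other variance-based formulations, for example ones acting on the domain part, would fit the abstract's description equally well.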

Code Implementations: 1 repo