Domain Mismatch Robust Acoustic Scene Classification using Channel Information Conversion
This work addresses the device channel mismatch that hinders real-world deployment of acoustic scene classification; it represents an incremental improvement.
The paper tackles channel mismatch in acoustic scene classification by proposing a channel domain conversion method based on a factorized hierarchical variational autoencoder. The method adapts both the source and target domains to a pre-defined specific domain, without requiring the relationship between the domains or information about each individual domain. Experiments on the IEEE DCASE 2018 Task 1-B dataset show that it can mitigate channel mismatch.
In recent acoustic scene classification (ASC) research, channel mismatch between training and test devices has become an obstacle to real-world implementation. To address this issue, this paper proposes a channel domain conversion method using a factorized hierarchical variational autoencoder. The proposed method adapts both the source and target domains to a pre-defined specific domain. Unlike the conventional approach, the adaptation process requires neither the relationship between the target and source domains nor information about each domain. Experimental results using the IEEE Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Task 1-B dataset and the baseline system show that the proposed approach can mitigate the channel mismatch between different recording devices.
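The idea of mapping every recording to one pre-defined domain can be illustrated with a minimal sketch. It is not the paper's model: the factorized hierarchical variational autoencoder is replaced here by a simple linear stand-in, in which the sequence-level factor of a recording is assumed to be the mean of its segment features. The function names, the synthetic data, and the zero-valued reference factor are all hypothetical.

```python
import numpy as np

def sequence_factor(features):
    """Per-recording (sequence-level) factor: the mean over segments.
    Assumption: a linear stand-in for a learned sequence-level latent."""
    return features.mean(axis=0)

def convert_to_reference(features, reference_factor):
    """Replace the recording's own sequence-level factor with the
    pre-defined reference factor, leaving segment-level variation intact."""
    return features - sequence_factor(features) + reference_factor

rng = np.random.default_rng(0)
n_segments, n_dims = 200, 40

# Synthetic recordings: scene content plus a device-dependent offset.
channel_a = rng.normal(size=n_dims)  # hypothetical device A
channel_b = rng.normal(size=n_dims)  # hypothetical device B
rec_a = rng.normal(size=(n_segments, n_dims)) + channel_a
rec_b = rng.normal(size=(n_segments, n_dims)) + channel_b

# Pre-defined target domain, chosen arbitrarily for this sketch.
reference_factor = np.zeros(n_dims)

# Each recording is converted on its own; no pairing of devices is used.
conv_a = convert_to_reference(rec_a, reference_factor)
conv_b = convert_to_reference(rec_b, reference_factor)

gap_before = np.linalg.norm(sequence_factor(rec_a) - sequence_factor(rec_b))
gap_after = np.linalg.norm(sequence_factor(conv_a) - sequence_factor(conv_b))
print(f"sequence-level gap before: {gap_before:.3f}, after: {gap_after:.3e}")
```

Because each recording is converted using only its own statistics and the shared reference, the conversion needs no knowledge of which device produced it or of how the two devices relate, which mirrors the property claimed for the proposed method.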