SD AS · May 5, 2021

Acoustic Scene Classification Using Multichannel Observation with Partially Missing Channels

arXiv:2105.01836v1
Originality: Synthesis-oriented
AI Analysis

This work addresses a practical issue for applications that use distributed microphone arrays, but its contribution is incremental: it builds on existing methods and adds data augmentation tailored to missing channels.

The paper tackles acoustic scene classification from multichannel audio recordings in which some channels are partially missing, as can happen with smartphones or IoT devices. It proposes simple data augmentation methods to counter the resulting performance degradation and reports improved classification results in its own evaluation.

Sounds recorded with smartphones or IoT devices often have partially unreliable observations caused by clipping, wind noise, and completely missing parts due to microphone failure and packet loss in data transmission over the network. In this paper, we investigate the impact of the partially missing channels on the performance of acoustic scene classification using multichannel audio recordings, especially for a distributed microphone array. Missing observations cause not only losses of time-frequency and spatial information on sound sources but also a mismatch between a trained model and evaluation data. We thus investigate how a missing channel affects the performance of acoustic scene classification in detail. We also propose simple data augmentation methods for scene classification using multichannel observations with partially missing channels and evaluate the scene classification performance using the data augmentation methods.
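
The abstract does not spell out the augmentation methods, so the following is only a minimal sketch of one plausible reading: simulating partially missing channels at training time by zeroing a random time segment in a random subset of channels. The function name `drop_channel_segments`, its parameters, and the choice of zero-filling are assumptions made for illustration, not the authors' method.

```python
import random


def drop_channel_segments(signal, max_missing_channels=1,
                          min_len=1, max_len=None, rng=None):
    """Return a copy of a multichannel signal with random segments zeroed.

    Hypothetical helper (not from the paper).
    signal: list of channels, each a list of samples of equal length.
    """
    rng = rng or random.Random()
    n_channels = len(signal)
    n_samples = len(signal[0])

    # Clamp the segment length bounds to the signal length.
    max_len = n_samples if max_len is None else min(max_len, n_samples)
    min_len = min(min_len, max_len)

    # Work on a copy so the clean training example is preserved.
    augmented = [list(channel) for channel in signal]

    # Assumption: always leave at least one channel fully intact.
    n_missing = rng.randint(0, min(max_missing_channels, n_channels - 1))

    for ch in rng.sample(range(n_channels), n_missing):
        length = rng.randint(min_len, max_len)
        start = rng.randint(0, n_samples - length)
        # Zero-fill the segment to imitate packet loss or microphone failure.
        augmented[ch][start:start + length] = [0.0] * length

    return augmented
```

Applying such a function to each training example with a fresh random draw would expose the model to the same kind of gaps it may see at evaluation time, which is the train/evaluation mismatch the abstract points to. Whether the authors zero-fill, mask features, or drop whole channels is not stated here.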

