AS · SD · Feb 14, 2020

Sound Event Localization based on Sound Intensity Vector Refined By DNN-Based Denoising and Source Separation

arXiv:2002.05994v1 · 39 citations
AI Analysis

This work addresses sound event localization for applications like audio surveillance or robotics, but it is incremental as it builds on existing physics-based and DNN-based methods.

The paper tackles the problem of direction-of-arrival estimation for sound event localization and detection by combining physics-based and DNN-based approaches to refine sound intensity vectors, achieving state-of-the-art accuracy on an open dataset for both single and overlapping sources.

We propose a direction-of-arrival (DOA) estimation method for Sound Event Localization and Detection (SELD). Direct estimation of DOA using a deep neural network (DNN), i.e., a completely data-driven approach, achieves high accuracy. However, there is a gap in accuracy between DOA estimation for single and overlapping sources, because such data-driven methods cannot incorporate physical knowledge. Meanwhile, although physics-based approaches are less accurate than DNN-based approaches, they are robust to overlapping sources. In this study, we consider a combination of physics-based and DNN-based approaches: the sound intensity vectors (IVs) used for physics-based DOA estimation are refined by DNN-based denoising and source separation. This method enables accurate DOA estimation for both single and overlapping sources using a spherical microphone array. Experimental results show that the proposed method achieves state-of-the-art DOA estimation accuracy on an open SELD dataset.
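To make the physics-based part of the abstract concrete, here is a minimal sketch of DOA estimation from sound intensity vectors. It covers only the unrefined baseline and does not include the DNN-based denoising or source separation that the paper proposes. It assumes first-order Ambisonics (FOA) STFT channels W, X, Y, Z in which a plane wave from unit direction u yields [X, Y, Z] = W * u (normalization factors ignored). The function name `intensity_vector_doa` and the synthetic test signal are illustrative, not taken from the paper.

```python
import numpy as np

def intensity_vector_doa(W, X, Y, Z):
    """Estimate a single DOA (azimuth, elevation in degrees) from FOA STFTs.

    W, X, Y, Z are complex arrays of identical shape (frequency, time).
    Assumes the encoding convention [X, Y, Z] = W * u for a source at unit
    direction u, so the intensity vector points toward the source.
    """
    # Intensity vector per time-frequency bin: Re(conj(W) * [X, Y, Z]).
    iv = np.stack([np.real(np.conj(W) * C) for C in (X, Y, Z)], axis=0)
    # Sum over all bins to obtain one direction vector for the segment.
    ix, iy, iz = iv.reshape(3, -1).sum(axis=1)
    azimuth = np.arctan2(iy, ix)
    elevation = np.arctan2(iz, np.hypot(ix, iy))
    return np.degrees(azimuth), np.degrees(elevation)

# Synthetic single source at azimuth 40 degrees, elevation 20 degrees.
rng = np.random.default_rng(0)
s = rng.standard_normal((64, 10)) + 1j * rng.standard_normal((64, 10))
az, el = np.radians(40.0), np.radians(20.0)
u = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
az_hat, el_hat = intensity_vector_doa(s, s * u[0], s * u[1], s * u[2])
print(az_hat, el_hat)
```

On this noiseless single-source signal the estimate matches the true direction up to floating-point error. With noise or overlapping sources, the per-bin intensity vectors become contaminated, which is the situation the DNN-based refinement described in the abstract is meant to address.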

