Multiple Sound Source Localisation with Steered Response Power Density and Hierarchical Grid Refinement
This work addresses the computational cost of sound source localisation for applications such as sound field analysis, though it appears incremental as an extension of existing SRP methods.
The paper tackles the problem of estimating the direction-of-arrival of multiple sound sources, which is computationally costly with existing methods, by introducing steered response power density and hierarchical grid refinement to reduce the number of steering directions. The results show robustness to reverberation and additive white noise, with evaluations indicating competitive performance against state-of-the-art methods.
Estimation of the direction-of-arrival (DOA) of sound sources is an important step in sound field analysis. Rigid spherical microphone arrays allow the calculation of a compact spherical harmonic representation of the sound field. A basic method for analysing sound fields recorded using such arrays is the steered response power (SRP) map, wherein the source DOA can be estimated as the steering direction that maximises the output power of a maximally-directive beam. This approach is computationally costly since it requires steering the beam in all possible directions. This paper presents an extension to SRP called steered response power density (SRPD) and an associated, signal-adaptive search method called hierarchical grid refinement (HiGRID) for reducing the number of steering directions needed for DOA estimation. The proposed method can localise coherent as well as incoherent sources while jointly providing the number of prominent sources in the scene. It is shown to be robust to reverberation and additive white noise. An evaluation of the proposed method using simulations and real recordings under highly reverberant conditions, as well as a comparison with state-of-the-art methods, are presented.
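To make the cost of the baseline concrete, the following is a minimal sketch of an exhaustive SRP search, not of SRPD or HiGRID. It rests on several assumptions that are not taken from the paper: the sound field is truncated at spherical harmonic order 1, real-valued spherical harmonics are used, the coefficients are assumed to be already equalised for the rigid-sphere mode strength, and the steering directions come from a Fibonacci lattice. All function names are hypothetical.

```python
import math

def real_sh_order1(azimuth, elevation):
    """Real spherical harmonics up to order 1 for a direction in radians.
    Assumption: ordering (n,m) = (0,0), (1,-1), (1,0), (1,1)."""
    x = math.cos(elevation) * math.cos(azimuth)
    y = math.cos(elevation) * math.sin(azimuth)
    z = math.sin(elevation)
    c0 = 1.0 / math.sqrt(4.0 * math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [c0, c1 * y, c1 * z, c1 * x]

def fibonacci_grid(num_points):
    """Nearly uniform steering directions as (azimuth, elevation) pairs."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for k in range(num_points):
        z = 1.0 - (2.0 * k + 1.0) / num_points
        directions.append(((k * golden_angle) % (2.0 * math.pi), math.asin(z)))
    return directions

def srp_map(frames, directions):
    """Mean output power of an order-1 maximally-directive beam, per direction.
    frames: list of 4-element coefficient vectors (assumed equalised)."""
    powers = []
    for azimuth, elevation in directions:
        weights = real_sh_order1(azimuth, elevation)
        total = 0.0
        for coeffs in frames:
            output = sum(w * a for w, a in zip(weights, coeffs))
            total += abs(output) ** 2
        powers.append(total / len(frames))
    return powers

def estimate_doa(frames, directions):
    """Return the steering direction with the largest steered response power."""
    powers = srp_map(frames, directions)
    best = max(range(len(directions)), key=powers.__getitem__)
    return directions[best]

def angle_between(dir_a, dir_b):
    """Great-circle angle in radians between two (azimuth, elevation) pairs."""
    cos_angle = (math.sin(dir_a[1]) * math.sin(dir_b[1])
                 + math.cos(dir_a[1]) * math.cos(dir_b[1])
                 * math.cos(dir_a[0] - dir_b[0]))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

# Synthetic single plane wave (illustrative values only, no noise or reverberation).
source = (1.0, 0.3)
frames = [[gain * c for c in real_sh_order1(*source)] for gain in (1.0, -0.5, 0.8)]
grid = fibonacci_grid(2000)
estimate = estimate_doa(frames, grid)
```

Every steering direction in `grid` requires one beamformer evaluation per frame, so the cost grows linearly with the grid density. This is the exhaustive search that a signal-adaptive scheme such as HiGRID is intended to avoid.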