SD · Sep 30, 2016

Hearing in a shoe-box : binaural source position and wall absorption estimation using virtually supervised learning

arXiv:1609.09747v2 · 10 citations
AI Analysis

This work addresses sound source localization and room-acoustics estimation, with applications such as audio processing and robotics. It presents a new framework, though its methodological advances are incremental.

The paper tackles the problem of estimating sound source position and wall absorption from binaural audio using a virtually supervised learning framework. It estimates azimuth, elevation, range, and wall absorption coefficients from binaural signals alone, and reports that incorporating random-diffusion effects in the simulated data improves the estimation of all parameters.

This paper introduces a new framework for supervised sound source localization referred to as virtually-supervised learning. An acoustic shoe-box room simulator is used to generate a large number of binaural single-source audio scenes. These scenes are used to build a dataset of spatial binaural features annotated with acoustic properties such as the 3D source position and the walls' absorption coefficients. A probabilistic high- to low-dimensional regression framework is used to learn a mapping from these features to the acoustic properties. Results indicate that this mapping successfully estimates not only the azimuth and elevation of new sources, but also their range and even the walls' absorption coefficients, solely based on binaural signals. Results also reveal that incorporating random-diffusion effects in the data significantly improves the estimation of all parameters.
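To make "spatial binaural features" concrete, here is a minimal sketch of one common choice: per-frequency interaural level differences (ILD) and interaural phase differences (IPD) computed from a left/right signal pair. The abstract does not say which features the paper uses, so ILD and IPD are an assumption here, and the function name `binaural_features` and its parameters (`n_fft`, `eps`) are hypothetical.

```python
import numpy as np

def binaural_features(left, right, n_fft=512, eps=1e-12):
    """Return (ild, ipd) per frequency bin for a two-channel recording.

    Assumption: ILD/IPD stand in for the paper's unspecified
    "spatial binaural features". Both inputs are 1-D arrays of equal
    length, at least n_fft samples long.
    """
    hop = n_fft // 2
    window = np.hanning(n_fft)
    n_frames = 1 + (len(left) - n_fft) // hop

    # Slice both channels into identical overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    spec_l = np.fft.rfft(left[idx] * window, axis=1)
    spec_r = np.fft.rfft(right[idx] * window, axis=1)

    # ILD: ratio of time-averaged power between the ears, in dB.
    pow_l = np.mean(np.abs(spec_l) ** 2, axis=0)
    pow_r = np.mean(np.abs(spec_r) ** 2, axis=0)
    ild = 10.0 * np.log10((pow_l + eps) / (pow_r + eps))

    # IPD: phase of the time-averaged cross-spectrum, kept as a unit
    # complex number so that phase wrapping does not cause jumps.
    cross = np.mean(spec_l * np.conj(spec_r), axis=0)
    ipd = cross / (np.abs(cross) + eps)
    return ild, ipd

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal(16000)
    right = 0.5 * left  # toy scene: right ear hears the same signal, attenuated
    ild, ipd = binaural_features(left, right)
    # One real-valued feature vector per scene: ILD plus real/imag parts of IPD.
    features = np.concatenate([ild, ipd.real, ipd.imag])
    print(features.shape)
```

In a virtually supervised setup as described above, a vector like `features` would be computed for each simulated scene and paired with that scene's known source position and wall absorption coefficients to form one training example for the regression.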

