Dependence on Early and Late Reverberation of Single-Channel Speaker Distance Estimation

arXiv:2605.0769454.1
AI Analysis

For researchers working on acoustic distance estimation, this work clarifies how accuracy depends on RIR components and calibration conditions, but the findings are incremental because they largely confirm known cues.

The study investigates which components of the room impulse response (RIR) a single-channel speaker distance estimation model uses. Without time calibration, the model relies on reverberation cues, chiefly early reflections, and its mean absolute error (MAE) rises to 1.29 m. With time calibration, it reaches 0.14 m MAE from the propagation delay alone.
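To illustrate why propagation delay alone can determine distance when capture is synchronised, here is a minimal sketch. It is not the paper's model. It assumes the recording starts at the instant of emission, that the direct path is the strongest peak of the RIR, and that the speed of sound is 343 m/s. The function name `distance_from_onset` is hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def distance_from_onset(rir, fs, c=SPEED_OF_SOUND):
    """Estimate source distance from the direct-path arrival time.

    Assumes time-calibrated capture (sample 0 is the emission instant)
    and that the direct path is the largest-magnitude sample.
    """
    rir = np.asarray(rir, dtype=float)
    onset = int(np.argmax(np.abs(rir)))  # index of direct-path peak
    return c * onset / fs                # delay in seconds times speed


# Usage: a direct path arriving 160 samples after emission at 16 kHz
rir = np.zeros(1600)
rir[160] = 1.0
d = distance_from_onset(rir, fs=16000)
```

Without time calibration the onset index is arbitrary, so this cue disappears and any estimator has to fall back on reverberation-based information.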

Single-channel speaker distance estimation has recently achieved centimeter-level accuracy in simulated environments, yet it remains unclear which components of the room impulse response (RIR) the model exploits and how performance depends on the recording conditions. In this work, we decompose simulated RIRs into four variants (full, direct-only, no-late, and no-early) using the mixing time estimated from the echo density function as the boundary between early reflections and late reverberation. We define four calibration scenarios, from fully calibrated (synchronised capture, known source level) to fully uncalibrated (arbitrary onset, unknown level), and evaluate all combinations on a matched dataset. Results show that without time calibration, mean absolute error (MAE) increases to $1.29$ m and the model extracts reverberation-based cues, with early reflections emerging as the most informative component. Further analysis against DRR, $C_{50}$, and $T_{60}$ confirms that estimation accuracy improves with stronger early energy and degrades in highly reverberant environments. When time calibration is available, the model achieves an MAE of $0.14$ m by extracting the propagation delay alone, regardless of the RIR content.
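The four-way decomposition described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the mixing time is taken as given (the echo-density estimation is not shown), it is measured from the direct-path onset, the direct path is assumed to occupy a 2.5 ms window after the strongest peak, and "no-early" is assumed to mean direct plus late. The function names and the `direct_window_s` parameter are hypothetical. DRR and $C_{50}$ are computed with their standard energy-ratio definitions.

```python
import numpy as np


def decompose_rir(rir, fs, mixing_time_s, direct_window_s=0.0025):
    """Split an RIR into full, direct-only, no-late and no-early variants.

    Assumptions (not from the paper): the direct path is the strongest
    peak plus a short window, and mixing_time_s counts from that peak.
    """
    rir = np.asarray(rir, dtype=float)
    n = len(rir)
    onset = int(np.argmax(np.abs(rir)))
    direct_end = min(n, onset + int(round(direct_window_s * fs)) + 1)
    late_start = min(n, max(direct_end, onset + int(round(mixing_time_s * fs))))

    direct = np.zeros_like(rir)
    early = np.zeros_like(rir)
    late = np.zeros_like(rir)
    direct[:direct_end] = rir[:direct_end]                  # direct path
    early[direct_end:late_start] = rir[direct_end:late_start]  # early reflections
    late[late_start:] = rir[late_start:]                    # late reverberation

    return {
        "full": rir.copy(),
        "direct_only": direct,
        "no_late": direct + early,
        "no_early": direct + late,
    }


def drr_db(rir, fs, direct_window_s=0.0025):
    """Direct-to-reverberant ratio in dB (direct energy over the rest)."""
    rir = np.asarray(rir, dtype=float)
    onset = int(np.argmax(np.abs(rir)))
    end = min(len(rir), onset + int(round(direct_window_s * fs)) + 1)
    return 10.0 * np.log10(np.sum(rir[:end] ** 2) / np.sum(rir[end:] ** 2))


def c50_db(rir, fs):
    """Clarity index C50 in dB: energy in the first 50 ms vs. the remainder."""
    rir = np.asarray(rir, dtype=float)
    onset = int(np.argmax(np.abs(rir)))
    split = min(len(rir), onset + int(round(0.050 * fs)))
    return 10.0 * np.log10(np.sum(rir[:split] ** 2) / np.sum(rir[split:] ** 2))


# Usage on a toy RIR: direct path, one early reflection, one late arrival
fs = 1000
toy = np.zeros(200)
toy[10], toy[20], toy[100] = 1.0, 0.5, 0.25
variants = decompose_rir(toy, fs, mixing_time_s=0.05)
toy_drr = drr_db(toy, fs)
toy_c50 = c50_db(toy, fs)
```

Zeroing segments in place, rather than truncating, keeps every variant the same length and time-aligned with the full RIR, so the variants can be convolved with the same speech signal and compared directly.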

