ASSD · Sep 29, 2021

A Universal Deep Room Acoustics Estimator

arXiv:2109.14436v1 · 14 citations
Originality: Incremental advance
AI Analysis

The paper addresses the degradation of speech audio quality by the acoustic environment, a problem relevant to audio processing applications. The advance is incremental because it builds on existing neural network approaches.

The paper estimates key room acoustic parameters without requiring Room Impulse Response measurements, using a convolutional recurrent neural network that the authors report outperforms state-of-the-art methods.

Speech audio quality is subject to degradation caused by an acoustic environment and isotropic ambient and point noises. The environment can lead to decreased speech intelligibility and loss of focus and attention by the listener. Basic acoustic parameters that characterize the environment well are (i) signal-to-noise ratio (SNR), (ii) speech transmission index, (iii) reverberation time, (iv) clarity, and (v) direct-to-reverberant ratio. Except for the SNR, these parameters are usually derived from the Room Impulse Response (RIR) measurements; however, such measurements are often not available. This work presents a universal room acoustic estimator design based on convolutional recurrent neural networks that estimate the acoustic environment measurement blindly and jointly. Our results indicate that the proposed system is robust to non-stationary signal variations and outperforms current state-of-the-art methods.
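For context, the sketch below shows how three of the listed parameters (reverberation time, clarity, and direct-to-reverberant ratio) are conventionally derived when an RIR is available. This is the measurement-based route that the paper's blind estimator is meant to replace; it is not the paper's method. The function names, the toy exponentially decaying RIR, the 50 ms early window, the 2.5 ms direct-path window, and the -5 to -25 dB fitting range are all assumptions chosen for illustration.

```python
import numpy as np


def synthetic_rir(fs=16000, rt60=0.5, duration=1.0, seed=0):
    """Hypothetical toy RIR: a direct-path spike plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    # Amplitude envelope that drops by 60 dB after rt60 seconds.
    decay = 10.0 ** (-3.0 * t / rt60)
    h = 0.1 * rng.standard_normal(t.size) * decay
    h[0] = 1.0  # direct path
    return h


def clarity_db(h, fs, early_ms=50.0):
    """Clarity (C50 by default): early-to-late energy ratio in dB."""
    onset = int(np.argmax(np.abs(h)))
    split = onset + int(fs * early_ms / 1000.0)
    early = np.sum(h[onset:split] ** 2)
    late = np.sum(h[split:] ** 2)
    return 10.0 * np.log10(early / late)


def drr_db(h, fs, window_ms=2.5):
    """Direct-to-reverberant ratio: energy near the direct peak vs. the remainder."""
    onset = int(np.argmax(np.abs(h)))
    half = int(fs * window_ms / 1000.0)
    lo, hi = max(onset - half, 0), onset + half + 1
    direct = np.sum(h[lo:hi] ** 2)
    reverb = np.sum(h[hi:] ** 2)
    return 10.0 * np.log10(direct / reverb)


def rt60_schroeder(h, fs, fit_db=(-5.0, -25.0)):
    """Reverberation time from Schroeder backward integration and a line fit."""
    # Energy decay curve: energy remaining from each sample to the end.
    energy = np.cumsum(h[::-1] ** 2)[::-1]
    edc_db = 10.0 * np.log10(energy / energy[0])
    t = np.arange(h.size) / fs
    # Fit a straight line over the chosen decay range and extrapolate to -60 dB.
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)
    return -60.0 / slope


fs = 16000
h = synthetic_rir(fs=fs, rt60=0.5)
print("RT60 [s]:", rt60_schroeder(h, fs))
print("C50 [dB]:", clarity_db(h, fs))
print("DRR [dB]:", drr_db(h, fs))
```

SNR and the speech transmission index are left out of this sketch: SNR depends on the noise rather than the RIR, and STI requires a modulation transfer function computation that is beyond a short example.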
