LG · Nov 29, 2017

Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality

Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar

arXiv:1711.10791v23.711 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for speech enhancement that remains robust across varying input conditions. It represents an incremental advance in adaptive noise suppression.

Noise-suppression algorithms perform best only within specific ranges of input signal-to-noise ratio. The paper addresses this limitation by using reinforcement learning to adapt algorithmic parameters dynamically, reporting 42% and 16% improvements in output SNR and MSE, respectively, compared to no adaptivity.

Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range-bound to specific levels of instantaneous input signal-to-noise ratio. In this paper, we present a new approach to improve the adaptivity of such algorithms, enabling them to perform robustly across a wide range of input signal and noise types. Our methodology is based on the dynamic control of algorithmic parameters via reinforcement learning. Specifically, we model the noise-suppression module as a black box, requiring no knowledge of the algorithmic mechanics except simple feedback from the output. We use this feedback as the reward signal for a reinforcement-learning agent that learns a policy to adapt the algorithmic parameters for every incoming audio frame (16 ms of data). Our preliminary results show that such a control mechanism can substantially increase the overall performance of the underlying noise-suppression algorithm: 42% and 16% improvements in output SNR and MSE, respectively, compared to no adaptivity.
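
The control loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's agent or suppressor: it swaps in a stateless epsilon-greedy bandit for the learned policy, a single scalar gain for the black-box noise suppressor, and negative MSE against a known clean frame as the reward. The 16 kHz sample rate, the synthetic Gaussian signals, and every name below (`suppress`, `EpsilonGreedyAgent`, `run`) are assumptions made for illustration only.

```python
import random

FRAME_MS = 16             # frame length stated in the abstract
SAMPLE_RATE_HZ = 16_000   # assumption: the abstract does not give a sample rate
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # samples per frame


def suppress(frame, gain):
    """Hypothetical black-box suppressor controlled by one parameter (a gain)."""
    return [gain * x for x in frame]


def mse(a, b):
    """Mean squared error between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


class EpsilonGreedyAgent:
    """Bandit over discrete parameter values; a stand-in for a learned policy."""

    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * len(self.actions)
        self.values = [0.0] * len(self.actions)  # running mean reward per action

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.actions))
        return max(range(len(self.actions)), key=lambda i: self.values[i])

    def update(self, i, reward):
        # Incremental update of the running mean reward for action i.
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]


def run(num_frames=1000, noise_std=1.0, seed=0):
    """Adapt the gain once per frame and return the gain the agent prefers."""
    env = random.Random(seed)
    agent = EpsilonGreedyAgent([0.0, 0.25, 0.5, 0.75, 1.0], seed=seed + 1)
    for _ in range(num_frames):
        clean = [env.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]
        noisy = [c + env.gauss(0.0, noise_std) for c in clean]
        i = agent.select()                       # choose parameter for this frame
        enhanced = suppress(noisy, agent.actions[i])
        agent.update(i, -mse(enhanced, clean))   # feedback from the output only
    best = max(range(len(agent.actions)), key=lambda i: agent.values[i])
    return agent.actions[best]


if __name__ == "__main__":
    print("samples per frame:", FRAME_LEN)
    print("preferred gain:", run())
```

With unit-variance signal and noise, the expected MSE of a gain g is (g - 1)^2 + g^2, which is smallest at g = 0.5, so the agent should settle on that value among the candidates. The agent never inspects how `suppress` works; it only sees the reward, which mirrors the black-box framing in the abstract. A per-frame policy of the kind the abstract describes would additionally condition its choice on features of the incoming frame.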
