Controllable Multichannel Speech Dereverberation based on Deep Neural Networks
This work addresses speech quality degradation in reverberant environments for applications such as hearing aids and speech recognition. It offers a flexible solution, though it appears incremental: its main addition is controllability on top of existing neural network methods.
The paper tackles the problem of speech dereverberation by proposing a deep neural network algorithm that allows controllable dereverberation levels, enabling recovery of both direct sound and early reflections, with efficacy confirmed in simulated conditions using spatially distributed microphones.
Neural network based speech dereverberation has achieved promising results in recent studies. Nevertheless, many of these studies focus on recovering only the direct-path sound, so early reflections, which could be beneficial to speech perception, are discarded. The performance of a model trained to recover clean speech degrades when evaluated on early reverberation targets, and vice versa. This paper proposes a novel deep neural network based multichannel speech dereverberation algorithm in which the dereverberation level is controllable. This is realized by adding a simple floating-point number as a target controller of the model. Experiments are conducted using spatially distributed microphones, and the efficacy of the proposed algorithm is confirmed in various simulated conditions.
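The abstract does not specify how the floating-point controller relates to the training target or how it enters the network, so the following is only an illustrative sketch under stated assumptions. It assumes the controller lies in [0, 1] and scales linearly to an early-reflection window of up to 50 ms after the direct path, that the target is the dry speech convolved with the room impulse response truncated to that window, and that the controller is appended to every input feature frame. The names `truncate_rir`, `convolve`, `condition_frames`, and `max_early_ms` are hypothetical and not taken from the paper.

```python
def truncate_rir(rir, direct_idx, controller, sample_rate=16000, max_early_ms=50.0):
    """Zero out the RIR tail beyond a controller-dependent early-reflection window.

    Assumption: controller = 0 keeps only the direct path, controller = 1 keeps
    the direct path plus `max_early_ms` of early reflections.
    """
    if not 0.0 <= controller <= 1.0:
        raise ValueError("controller must lie in [0, 1]")
    # Number of samples of early reflections to keep after the direct path.
    keep = int(round(controller * max_early_ms * 1e-3 * sample_rate))
    end = min(len(rir), direct_idx + keep + 1)
    return [h if i < end else 0.0 for i, h in enumerate(rir)]


def convolve(x, h):
    """Plain full linear convolution, used here to build the target signal."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def condition_frames(frames, controller):
    """Append the controller value to each feature frame (assumed conditioning)."""
    return [list(frame) + [float(controller)] for frame in frames]


# Toy example: a 5-tap RIR at 1 kHz with the direct path at index 1.
toy_rir = [0.0, 1.0, 0.5, 0.25, 0.125]
direct_only = truncate_rir(toy_rir, 1, 0.0, sample_rate=1000, max_early_ms=2.0)
with_early = truncate_rir(toy_rir, 1, 1.0, sample_rate=1000, max_early_ms=2.0)
target = convolve([1.0, 2.0], direct_only)
inputs = condition_frames([[0.1, 0.2], [0.3, 0.4]], 0.5)
```

Under these assumptions, a single model can be trained on randomly sampled controller values with matching targets, and the same scalar can then be set at inference time to choose how much early reverberation is retained.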