Dereverberation using joint estimation of dry speech signal and acoustic system
This work addresses the degradation of speech quality caused by reverberation, but appears incremental because it builds on existing deep learning methods.
The authors tackle speech dereverberation by jointly estimating the dry speech signal and the room impulse response, combining deep learning models for the two tasks into a joint model with shared parameters.
The purpose of speech dereverberation is to remove from the signal the quality-degrading effects of a time-invariant impulse response filter. In this report, we describe an approach to speech dereverberation that jointly estimates the dry speech signal and the room impulse response. We explore deep learning models for each task separately, and how these can be combined in a joint model with shared parameters.
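To make the signal model concrete, the following minimal sketch simulates reverberation as the convolution of a dry signal with a time-invariant room impulse response. It illustrates only the forward model that dereverberation tries to invert, not the method of this report. The function names `synthetic_rir` and `reverberate`, the exponentially decaying noise used as an impulse response, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def synthetic_rir(length=2048, decay=0.002, seed=0):
    """Toy room impulse response: exponentially decaying white noise.

    Hypothetical stand-in for a measured impulse response; the decay
    rate and length are arbitrary illustration values.
    """
    rng = np.random.default_rng(seed)
    n = np.arange(length)
    h = rng.standard_normal(length) * np.exp(-decay * n)
    h[0] = 1.0  # direct-path component
    return h / np.max(np.abs(h))  # scale so the largest tap has magnitude 1


def reverberate(dry, rir):
    """Apply a time-invariant impulse response filter to a dry signal."""
    return np.convolve(dry, rir, mode="full")


if __name__ == "__main__":
    # White noise stands in for a dry speech signal here.
    dry = np.random.default_rng(1).standard_normal(16000)
    rir = synthetic_rir()
    wet = reverberate(dry, rir)
    print(wet.shape)
```

Under this model, a joint estimate of the dry signal and the impulse response can be sanity-checked by convolving the two estimates and comparing the result with the observed reverberant signal.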