AS SD · Oct 5, 2021

Late reverberation suppression using U-nets

arXiv:2110.02144v12.31 citations

Originality: Incremental advance

AI Analysis

This work addresses speech enhancement for applications such as speech recognition and audio compression, but the advance is incremental because it builds on existing U-net architectures.

The paper tackles speech dereverberation in real-world settings by proposing a U-net-based method for Late Reverberation Suppression. It reports improvements in quality and intelligibility metrics over the original U-net and performance competitive with state-of-the-art GAN-based approaches.

In real-world settings, speech signals are almost always affected by reverberation produced by the working environment; these corrupted signals need to be dereverberated prior to performing, e.g., speech recognition, speech-to-text conversion, compression, or general audio enhancement. In this paper, we propose a supervised dereverberation technique using U-nets with skip connections, which are fully-convolutional encoder-decoder networks with layers arranged in the form of a "U" and connections that "skip" some layers. Building on this architecture, we address speech dereverberation through the lens of Late Reverberation Suppression (LS). Via experiments on synthetic and real-world data with different noise levels and reverberation settings, we show that our proposed method, termed "LS U-net", improves quality, intelligibility and other performance metrics compared to the original U-net method and is on par with state-of-the-art GAN-based approaches.
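The encoder-decoder shape and the skip connections described in the abstract can be illustrated with a toy sketch. This is not the paper's LS U-net: it is a pure-Python, single-channel, 1-D data-flow example with a fixed (non-learned) smoothing kernel, and every name in it (`unet_forward`, `conv1d_same`, `downsample`, `upsample`) is hypothetical. As a further simplification, the skip connection averages encoder and decoder features, where a real U-net would typically concatenate channels.

```python
# Toy 1-D "U"-shaped forward pass with skip connections.
# Assumptions: fixed kernel instead of learned weights, one channel,
# averaging merge instead of channel concatenation.

def conv1d_same(x, kernel):
    """1-D correlation with zero padding, output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def relu(x):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]

def downsample(x):
    """Halve the length by averaging adjacent pairs (encoder side)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Double the length by repeating each sample (decoder side)."""
    return [v for v in x for _ in range(2)]

def unet_forward(x, depth=2, kernel=(0.25, 0.5, 0.25)):
    """Run the signal down the encoder, through a bottleneck, and back up."""
    if len(x) % (2 ** depth) != 0:
        raise ValueError("input length must be divisible by 2**depth")
    skips = []
    h = list(x)
    # Encoder: filter, remember the features, then reduce resolution.
    for _ in range(depth):
        h = relu(conv1d_same(h, kernel))
        skips.append(h)
        h = downsample(h)
    # Bottleneck: the bottom of the "U".
    h = relu(conv1d_same(h, kernel))
    # Decoder: restore resolution and merge the stored encoder features,
    # which "skip" over the lower layers.
    for skip in reversed(skips):
        h = upsample(h)
        h = [(a + b) / 2.0 for a, b in zip(h, skip)]
        h = relu(conv1d_same(h, kernel))
    return h

signal = [1.0, 0.0, 2.0, 0.0, 1.0, 3.0, 0.0, 1.0]
output = unet_forward(signal, depth=2)
print(len(signal), len(output))
```

The skip connections hand each decoder stage the encoder features at the same resolution, so detail lost in downsampling can be reintroduced on the way back up. In a trained network the kernels would be learned from paired reverberant and target signals.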
