AS · SD · Oct 5, 2021

Late reverberation suppression using U-nets

arXiv:2110.02144v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses speech enhancement for applications such as speech recognition and audio compression, but the advance is incremental because it builds on existing U-net architectures.

The paper tackles speech dereverberation in real-world settings by proposing a U-net-based method for Late Reverberation Suppression, showing improvements in quality and intelligibility metrics compared to original U-nets and competitive performance with state-of-the-art GAN-based approaches.

In real-world settings, speech signals are almost always affected by reverberation produced by the working environment; these corrupted signals need to be dereverberated prior to performing, e.g., speech recognition, speech-to-text conversion, compression, or general audio enhancement. In this paper, we propose a supervised dereverberation technique using U-nets with skip connections, which are fully-convolutional encoder-decoder networks with layers arranged in the form of a "U" and connections that "skip" some layers. Building on this architecture, we address speech dereverberation through the lens of Late Reverberation Suppression (LS). Via experiments on synthetic and real-world data with different noise levels and reverberation settings, we show that our proposed method, termed "LS U-net", improves quality, intelligibility, and other performance metrics compared to the original U-net method and performs on par with state-of-the-art GAN-based approaches.
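
The encoder-decoder data flow described in the abstract can be illustrated with a small dependency-free sketch. This is not the paper's LS U-net: the 1-D input, the depth, the channel counts, and the fixed averaging filters (standing in for learned weights) are all assumptions made for illustration, and the names `tiny_unet`, `make_kernels`, `conv1d_same`, `downsample`, and `upsample` are hypothetical. It only shows how the encoder halves the resolution, how the decoder restores it, and how skip connections concatenate encoder features onto decoder features at the same resolution.

```python
from typing import List

Signal = List[float]      # one channel of samples
Features = List[Signal]   # several channels of equal length


def make_kernels(out_ch: int, in_ch: int, width: int = 3) -> List[List[Signal]]:
    """Hypothetical stand-in for learned weights: fixed averaging filters."""
    w = 1.0 / (in_ch * width)
    return [[[w] * width for _ in range(in_ch)] for _ in range(out_ch)]


def conv1d_same(x: Features, kernels: List[List[Signal]]) -> Features:
    """Zero-padded 'same' 1-D convolution (cross-correlation) followed by ReLU.

    kernels[o][i] is the odd-length filter from input channel i to output channel o.
    """
    n = len(x[0])
    out = []
    for per_input in kernels:
        y = [0.0] * n
        for chan, k in zip(x, per_input):
            half = len(k) // 2
            for t in range(n):
                for j, w in enumerate(k):
                    s = t + j - half
                    if 0 <= s < n:
                        y[t] += w * chan[s]
        out.append([max(0.0, v) for v in y])
    return out


def downsample(x: Features) -> Features:
    """Max pooling with window 2: halves the length of every channel."""
    return [[max(c[i], c[i + 1]) for i in range(0, len(c) - 1, 2)] for c in x]


def upsample(x: Features) -> Features:
    """Nearest-neighbour upsampling: doubles the length of every channel."""
    return [[v for v in c for _ in range(2)] for c in x]


def tiny_unet(signal: Signal, depth: int = 2, base_channels: int = 2) -> Signal:
    """Toy U-shaped encoder-decoder with skip connections (illustrative only)."""
    if len(signal) % (2 ** depth) != 0:
        raise ValueError("signal length must be divisible by 2**depth")
    feats: Features = [list(signal)]
    skips: List[Features] = []
    ch = base_channels
    # Encoder (left arm of the "U"): convolve, remember the features, downsample.
    for _ in range(depth):
        feats = conv1d_same(feats, make_kernels(ch, len(feats)))
        skips.append(feats)
        feats = downsample(feats)
        ch *= 2
    # Bottleneck at the lowest resolution.
    feats = conv1d_same(feats, make_kernels(ch, len(feats)))
    # Decoder (right arm): upsample, concatenate the matching skip, convolve.
    for skip in reversed(skips):
        feats = upsample(feats)
        feats = feats + skip  # skip connection: channel-wise concatenation
        feats = conv1d_same(feats, make_kernels(len(skip), len(feats)))
    # Width-1 projection back to a single output channel.
    return conv1d_same(feats, make_kernels(1, len(feats), width=1))[0]


if __name__ == "__main__":
    toy_input = [0.0, 1.0, 0.5, 0.25, 0.0, 0.0, 1.0, 0.5]
    print(len(tiny_unet(toy_input)))
```

Because every downsampling step is mirrored by an upsampling step, the output has the same length as the input, which is what lets the skip connections line up with the decoder features. In a trained network the fixed filters would be replaced by learned ones.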

Code Implementations: 1 repo