LG · SD · AS · Sep 12, 2023

CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram

arXiv:2309.05975v1 · 14 citations · h-index: 59
Originality: Incremental advance
AI Analysis

This work addresses speech denoising for audio processing applications, presenting an incremental improvement over existing state-of-the-art methods.

The paper tackles speech denoising by combining waveform and spectrogram denoisers in a two-stage model, achieving improved performance over previous methods in objective and subjective evaluations.

In this work, we present CleanUNet 2, a speech denoising model that combines the advantages of a waveform denoiser and a spectrogram denoiser and achieves the best of both worlds. CleanUNet 2 uses a two-stage framework inspired by popular speech synthesis methods that consist of a waveform model and a spectrogram model. Specifically, CleanUNet 2 builds upon CleanUNet, the state-of-the-art waveform denoiser, and further boosts its performance by taking predicted spectrograms from a spectrogram denoiser as the input. We demonstrate that CleanUNet 2 outperforms previous methods in terms of various objective and subjective evaluations.
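
The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of how the stages connect, not the paper's implementation: both stages in CleanUNet 2 are learned neural networks, whereas `spectrogram_denoiser` and `waveform_denoiser` below are hypothetical hand-written placeholders (median noise-floor subtraction and an energy-based gain). The function names, the STFT settings (`n_fft`, `hop`), and the way the predicted spectrogram conditions the waveform stage are all assumptions made for the example.

```python
import numpy as np


def magnitude_spectrogram(wave, n_fft=64, hop=16):
    """Return |STFT| of a 1-D signal, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack(
        [wave[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def spectrogram_denoiser(noisy_spec):
    """Stage 1 placeholder (assumption): subtract a per-bin median noise floor.

    In the paper this stage is a learned spectrogram denoiser.
    """
    noise_floor = np.median(noisy_spec, axis=0, keepdims=True)
    return np.maximum(noisy_spec - noise_floor, 0.0)


def waveform_denoiser(noisy_wave, predicted_spec, hop=16):
    """Stage 2 placeholder (assumption): gate the waveform by predicted energy.

    In the paper this stage is a learned waveform denoiser that takes the
    predicted spectrogram as an input; the gating here only mimics that
    dependency.
    """
    frame_energy = predicted_spec.sum(axis=1)
    gain = frame_energy / (frame_energy.max() + 1e-8)  # values in [0, 1]
    # Upsample the per-frame gain to one value per audio sample.
    frame_starts = np.arange(len(gain)) * hop
    gain_per_sample = np.interp(np.arange(len(noisy_wave)), frame_starts, gain)
    return noisy_wave * gain_per_sample


def two_stage_denoise(noisy_wave):
    """Noisy waveform -> predicted spectrogram -> denoised waveform."""
    noisy_spec = magnitude_spectrogram(noisy_wave)
    predicted_spec = spectrogram_denoiser(noisy_spec)
    return waveform_denoiser(noisy_wave, predicted_spec)
```

The point of the sketch is the wiring: the second stage receives both the noisy waveform and the first stage's predicted spectrogram, so the waveform model does not have to work from the raw noisy signal alone.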

