SD · AI · AS · Feb 5, 2021

Real-time Denoising and Dereverberation with Tiny Recurrent U-Net

arXiv:2102.03207v3 · 78 citations
AI Analysis

State-of-the-art speech enhancement models are often too large to run on edge devices. This work targets that gap with a model small enough for real-time, on-device use.

This paper introduces Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model for speech enhancement. It achieves competitive performance with state-of-the-art models while being significantly smaller, with a quantized size of 362 kilobytes.
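To put the 362-kilobyte figure in perspective, the sketch below converts a file size into an approximate weight count. The bit width of the quantized weights is not stated here, so 8, 16, and 32 bits are all shown as assumptions. The helper name `params_from_size` is hypothetical, and the estimate ignores any file-format overhead or non-weight data.

```python
def params_from_size(size_bytes: int, bits_per_weight: int) -> int:
    """Upper bound on the number of weights that fit in size_bytes.

    Assumes the file holds nothing but densely packed weights.
    """
    return size_bytes * 8 // bits_per_weight


SIZE_KB = 362  # quantized model size reported for TRU-Net

# "Kilobyte" may mean 1000 or 1024 bytes; both readings are shown.
for kb_label, kb_bytes in (("1 KB = 1000 B", 1000), ("1 KB = 1024 B", 1024)):
    for bits in (8, 16, 32):  # assumed bit widths, not taken from the paper
        count = params_from_size(SIZE_KB * kb_bytes, bits)
        print(f"{kb_label}, {bits}-bit weights: about {count:,} parameters")
```

Under the 8-bit assumption the budget is a few hundred thousand weights, which is the scale the "orders of magnitude smaller" comparison refers to.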

Modern deep learning-based models have shown outstanding performance improvements in speech enhancement tasks. The number of parameters in state-of-the-art models, however, is often too large for them to be deployed on devices for real-world applications. To this end, we propose Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model that matches the performance of current state-of-the-art models. The size of the quantized version of TRU-Net is 362 kilobytes, which is small enough to be deployed on edge devices. In addition, we combine the small-sized model with a new masking method called the phase-aware $\beta$-sigmoid mask, which enables simultaneous denoising and dereverberation. Results of both objective and subjective evaluations show that our model achieves performance competitive with current state-of-the-art models on benchmark datasets while using orders of magnitude fewer parameters.
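The abstract names the phase-aware $\beta$-sigmoid mask but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: the mask magnitude is a sigmoid scaled by a positive bound $\beta$ (so it can exceed 1, which a plain sigmoid cannot), and a separately estimated phase rotates each time-frequency bin. The function names, the fixed `beta`, and the random inputs are all hypothetical; this is not the paper's exact formulation.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def beta_sigmoid_mask(logits: np.ndarray, beta: float, phase: np.ndarray) -> np.ndarray:
    """Complex mask with magnitude in (0, beta) and the given phase.

    Assumed form for illustration: beta * sigmoid(logits) * exp(j * phase).
    """
    magnitude = beta * sigmoid(logits)
    return magnitude * np.exp(1j * phase)


def apply_mask(noisy_stft: np.ndarray, mask: np.ndarray) -> np.ndarray:
    # Element-wise complex multiplication scales the magnitude and
    # rotates the phase of every time-frequency bin.
    return noisy_stft * mask


# Toy inputs standing in for a noisy STFT and for network outputs.
rng = np.random.default_rng(0)
frames, bins = 4, 5
noisy = rng.standard_normal((frames, bins)) + 1j * rng.standard_normal((frames, bins))
logits = rng.standard_normal((frames, bins))
phase = rng.uniform(-np.pi, np.pi, size=(frames, bins))
beta = 2.0  # assumed constant; a model could instead predict it per bin

mask = beta_sigmoid_mask(logits, beta, phase)
enhanced = apply_mask(noisy, mask)
```

Letting the magnitude exceed 1 matters when the target is louder than the mixture in a given bin, which can happen when noise or reverberation partially cancels the clean signal.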

Code Implementations: 1 repo