SD AI AS · Feb 5, 2021

Real-time Denoising and Dereverberation with Tiny Recurrent U-Net

Hyeong-Seok Choi, Sungjin Park, Jie Hwan Lee, Hoon Heo, Dongsuk Jeon, Kyogu Lee

arXiv:2102.03207v3 16.278 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of deploying high-performing speech enhancement models on edge devices for real-world applications, offering a practical solution for developers and users of such devices.

This paper introduces Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model for speech enhancement. It achieves competitive performance with state-of-the-art models while being significantly smaller, with a quantized size of 362 kilobytes.

Modern deep learning-based models have achieved outstanding performance improvements on speech enhancement tasks. The number of parameters in state-of-the-art models, however, is often too large for deployment on devices in real-world applications. To this end, we propose Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model that matches the performance of current state-of-the-art models. The quantized version of TRU-Net is 362 kilobytes in size, which is small enough to be deployed on edge devices. In addition, we combine the small-sized model with a new masking method called the phase-aware $β$-sigmoid mask, which enables simultaneous denoising and dereverberation. Results of both objective and subjective evaluations show that our model achieves performance competitive with current state-of-the-art models on benchmark datasets while using orders of magnitude fewer parameters.
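
To put the 362-kilobyte figure in perspective, the following minimal sketch estimates how many weights fit in that budget at different bit widths. It is illustrative arithmetic only: the abstract does not state the bit width used for quantization, the function name `params_that_fit` is hypothetical, 1 kilobyte is assumed to be 1000 bytes, and storage overhead such as scales, zero points, and file metadata is ignored.

```python
def params_that_fit(size_bytes: int, bits_per_weight: int) -> int:
    """Return how many weights fit in size_bytes at the given bit width.

    Ignores any per-tensor overhead (scales, zero points, file headers).
    """
    return (size_bytes * 8) // bits_per_weight


# 362 kilobytes from the abstract; 1 kB = 1000 bytes is an assumption.
QUANTIZED_SIZE_BYTES = 362 * 1000

# The bit widths below are assumptions for comparison, not values from the paper.
for bits in (32, 16, 8):
    count = params_that_fit(QUANTIZED_SIZE_BYTES, bits)
    print(f"{bits:>2}-bit weights: about {count:,} parameters")
```

Under these assumptions, the same byte budget holds four times as many 8-bit weights as 32-bit ones, which is why quantization matters for edge deployment.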

