SDAS
Jun 22, 2021

Learning to Inference with Early Exit in the Progressive Speech Enhancement

arXiv:2106.11730v1
9 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible inference speed in speech enhancement for real-world applications, representing an incremental improvement over existing methods.

The paper tackles the problem of controlling inference speed in speech enhancement systems by proposing a stage-wise adaptive inference approach with early exit: inference stops as soon as the spectral distance between adjacent stages falls below a threshold. Results on the TIMIT corpus show superiority over state-of-the-art baselines in terms of PESQ, ESTOI, and DNSMOS, and adjusting the threshold trades inference efficiency against performance.

In real scenarios, it is often necessary to control the inference speed of speech enhancement systems under different conditions. To this end, we propose a stage-wise adaptive inference approach with an early exit mechanism for progressive speech enhancement. Specifically, at each stage, once the spectral distance between adjacent stages falls below an empirically preset threshold, inference terminates and outputs the current estimate, which effectively accelerates inference. To further improve the performance of existing speech enhancement systems, we propose PL-CRN++, an improved version of our preliminary work PL-CRN that combines a stage-recurrent mechanism with complex spectral mapping. Extensive experiments are conducted on the TIMIT corpus, and the results demonstrate the superiority of our system over state-of-the-art baselines in terms of PESQ, ESTOI and DNSMOS. Moreover, by adjusting the threshold, we can easily control the inference efficiency while sustaining the system performance.
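The early exit rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The names `spectral_distance`, `progressive_enhance`, and `toy_stage` are hypothetical, the mean squared difference is an assumed stand-in for whichever spectral distance the paper uses, and the toy stage replaces a real PL-CRN++ stage with a function that halves the remaining gap to a known clean target.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Spectrum = List[float]


def spectral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two spectra (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def progressive_enhance(
    noisy: Sequence[float],
    stage_fn: Callable[[Spectrum, int], Spectrum],
    max_stages: int,
    threshold: float,
) -> Tuple[Spectrum, int]:
    """Run stages in order and stop early once adjacent outputs are close.

    Returns the final estimate and the number of stages actually executed.
    """
    current: Spectrum = list(noisy)
    previous_output: Optional[Spectrum] = None
    for stage in range(1, max_stages + 1):
        output = stage_fn(current, stage)
        # Compare this stage's output with the previous stage's output.
        if previous_output is not None:
            if spectral_distance(output, previous_output) < threshold:
                return output, stage  # early exit
        previous_output = output
        current = output
    return current, max_stages


# Hypothetical clean target, used only by the toy stage below.
CLEAN: Spectrum = [0.0, 0.0]


def toy_stage(estimate: Spectrum, stage: int) -> Spectrum:
    """Toy stand-in for a network stage: halve the gap to CLEAN."""
    return [e + 0.5 * (c - e) for e, c in zip(estimate, CLEAN)]
```

In this sketch a larger threshold exits after fewer stages and a threshold of zero always runs all stages, which mirrors the speed and quality trade-off the abstract attributes to threshold tuning.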

Code Implementations: 1 repo
