SD AS Jun 22, 2021

Learning to Inference with Early Exit in the Progressive Speech Enhancement

Andong Li, Chengshi Zheng, Lu Zhang, Xiaodong Li

arXiv:2106.11730v1

8.69 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for flexible inference speed in speech enhancement for real-world applications, representing an incremental improvement over existing methods.

The paper tackles the problem of controlling inference speed in speech enhancement systems. It proposes a stage-wise adaptive inference approach with early exit: inference stops as soon as the spectral distance between the outputs of adjacent stages falls below a preset threshold, which skips the remaining stages. On the TIMIT corpus, the reported results show superiority over state-of-the-art baselines in terms of PESQ, ESTOI, and DNSMOS, and adjusting the threshold trades inference efficiency against performance.

In real scenarios, it is often necessary to control the inference speed of speech enhancement systems under different conditions. To this end, we propose a stage-wise adaptive inference approach with an early exit mechanism for progressive speech enhancement. Specifically, at each stage, once the spectral distance between adjacent stages falls below an empirically preset threshold, inference terminates and the current estimate is output, which effectively accelerates inference. To further improve the performance of existing speech enhancement systems, we propose PL-CRN++, an improved version of our preliminary work PL-CRN that combines a stage recurrent mechanism with complex spectral mapping. Extensive experiments are conducted on the TIMIT corpus, and the results demonstrate the superiority of our system over state-of-the-art baselines in terms of PESQ, ESTOI and DNSMOS. Moreover, by adjusting the threshold, we can easily control the inference efficiency while sustaining the system performance.
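The early exit rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' PL-CRN++ implementation. The names `spectral_distance`, `progressive_enhance`, and `make_toy_stage` are hypothetical, the mean squared difference is only an assumed stand-in for the paper's spectral distance, and the toy stage replaces a real network stage.

```python
from typing import Callable, List, Sequence, Tuple

Spectrum = List[float]  # assumed representation: a flattened magnitude spectrum
Stage = Callable[[Sequence[float], Sequence[float]], Spectrum]


def spectral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two spectra (assumed metric, not the paper's)."""
    if len(a) != len(b) or not a:
        raise ValueError("spectra must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def progressive_enhance(
    noisy: Sequence[float], stages: Sequence[Stage], threshold: float
) -> Tuple[Spectrum, int]:
    """Run stages in order; stop once adjacent stage outputs differ by less than threshold.

    Returns the final estimate and the number of stages actually executed.
    """
    estimate: Spectrum = list(noisy)
    previous = None
    for index, stage in enumerate(stages, start=1):
        # Each stage refines the previous estimate, with the noisy input as context.
        estimate = stage(noisy, estimate)
        if previous is not None and spectral_distance(estimate, previous) < threshold:
            return estimate, index  # early exit: later stages are skipped
        previous = estimate
    return estimate, len(stages)


def make_toy_stage(clean: Sequence[float]) -> Stage:
    """Hypothetical stand-in for a network stage: halves the remaining error."""
    def stage(noisy: Sequence[float], estimate: Sequence[float]) -> Spectrum:
        return [e + 0.5 * (c - e) for e, c in zip(estimate, clean)]
    return stage


if __name__ == "__main__":
    clean = [1.0, 2.0, 3.0]
    noisy = [3.0, 0.0, 5.0]
    stages = [make_toy_stage(clean)] * 5
    for threshold in (0.1, 0.0):
        _, used = progressive_enhance(noisy, stages, threshold)
        print(f"threshold={threshold}: stages used = {used}")
```

In this toy setting, a larger threshold exits after fewer stages, while a threshold of zero never triggers the exit and runs every stage, which mirrors the efficiency and performance trade-off the abstract attributes to the threshold.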
