ASAI, May 24, 2025

TS-URGENet: A Three-stage Universal Robust and Generalizable Speech Enhancement Network

arXiv:2505.18533v1 | 5 citations | h-index: 7 | INTERSPEECH
Originality: Synthesis-oriented
AI Analysis

This work addresses speech enhancement for diverse distortions, but appears incremental as it builds on existing multi-stage architectures.

The paper tackles universal speech enhancement, which must handle varied distortions and input formats, by proposing TS-URGENet, a three-stage network that placed 2nd in Track 1 of the Interspeech 2025 URGENT Challenge.

Universal speech enhancement aims to handle input speech with different distortions and input formats. To tackle this challenge, we present TS-URGENet, a Three-Stage Universal, Robust, and Generalizable speech Enhancement Network. To address various distortions, the proposed system employs a novel three-stage architecture consisting of a filling stage, a separation stage, and a restoration stage. The filling stage mitigates packet loss by preliminarily filling lost regions under noise interference, ensuring signal continuity. The separation stage suppresses noise, reverberation, and clipping distortion to improve speech clarity. Finally, the restoration stage compensates for bandwidth limitation, codec artifacts, and residual packet loss distortion, refining the overall speech quality. Our proposed TS-URGENet achieved outstanding performance in the Interspeech 2025 URGENT Challenge, ranking 2nd in Track 1.
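The order of the three stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `None`-as-lost-sample convention, and the toy signal operations (sample holding, a noise gate, an identity pass) are all hypothetical stand-ins for the neural networks used in TS-URGENet. It only shows how the output of each stage feeds the next.

```python
from typing import List, Optional, Sequence

Signal = List[float]


def filling_stage(samples: Sequence[Optional[float]]) -> Signal:
    """Hypothetical stand-in for the filling stage.

    Lost samples are marked as None (an assumption of this sketch) and are
    filled by holding the last received value, so the signal stays continuous.
    """
    filled: Signal = []
    last = 0.0
    for s in samples:
        if s is not None:
            last = s
        filled.append(last)
    return filled


def separation_stage(x: Signal, threshold: float = 0.05) -> Signal:
    """Hypothetical stand-in for the separation stage.

    A simple noise gate zeroes low-amplitude samples; the real stage
    suppresses noise, reverberation, and clipping distortion.
    """
    return [s if abs(s) >= threshold else 0.0 for s in x]


def restoration_stage(x: Signal) -> Signal:
    """Hypothetical stand-in for the restoration stage.

    Identity pass; the real stage compensates for bandwidth limitation,
    codec artifacts, and residual packet loss.
    """
    return list(x)


def enhance(samples: Sequence[Optional[float]]) -> Signal:
    """Chain the stages in the order given in the abstract."""
    x = filling_stage(samples)
    x = separation_stage(x)
    return restoration_stage(x)


if __name__ == "__main__":
    degraded = [0.5, None, None, 0.01, -0.8]
    print(enhance(degraded))
```

Placing the filling stage first means the later stages always receive a gap-free signal, which matches the abstract's point that filling is done "preliminarily" and that residual packet loss is handled again during restoration.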

