A two-step backward compatible fullband speech enhancement system
This work addresses the need for improved speech enhancement at higher sample rates for audio applications, but it is incremental as it builds on existing deep learning methods.
The paper tackles fullband (48kHz) speech enhancement by proposing a two-step system that delivers good fullband enhancement quality while remaining backward compatible with existing wideband (16kHz) systems.
Speech enhancement methods based on deep learning have surpassed traditional methods. While many of these new approaches operate at the wideband (16kHz) sample rate, this paper proposes a new fullband (48kHz) speech enhancement system. Unlike existing fullband systems, which use perceptually motivated features to train a single network for fullband speech enhancement, the proposed system is a two-step system that ensures good fullband speech enhancement quality while remaining backward compatible with existing wideband systems.