ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction
This work addresses rare word recognition errors in ASR, which can degrade downstream tasks such as keyword spotting and intent detection. It represents an incremental improvement with specific gains.
The paper tackles the problem of rare word recognition errors in automatic speech recognition (ASR) systems by proposing a postprocessing method based on error detection and context-aware correction, achieving significantly lower word error rates across five datasets while maintaining reasonable inference speed.
Automatic speech recognition (ASR) systems often encounter difficulties in accurately recognizing rare words, leading to errors that can have a negative impact on downstream tasks such as keyword spotting, intent detection, and text summarization. To address this challenge, we present a novel ASR postprocessing method that focuses on improving the recognition of rare words through error detection and context-aware error correction. Our method optimizes the decoding process by targeting only the predicted error positions, minimizing unnecessary computations. Moreover, we leverage a rare word list to provide additional contextual knowledge, enabling the model to better correct rare words. Experimental results across five datasets demonstrate that our proposed method achieves significantly lower word error rates (WERs) than previous approaches while maintaining a reasonable inference speed. Furthermore, our approach exhibits promising robustness across different ASR systems.
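The idea of editing only the predicted error positions, with a rare word list supplying candidates, can be illustrated with a minimal sketch. This is not the paper's model. The error flags are assumed to come from some upstream detector, a simple string-similarity heuristic stands in for the context-aware correction step, and the names `wer`, `correct_flagged`, and `min_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def wer(reference: list[str], hypothesis: list[str]) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # Rolling-row Levenshtein distance over words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


def correct_flagged(tokens: list[str], error_flags: list[bool],
                    rare_words: list[str],
                    min_similarity: float = 0.6) -> list[str]:
    """Rewrite only the flagged positions, using the rare word list as candidates.

    Assumption: string similarity is a stand-in for a learned,
    context-aware corrector. Unflagged tokens are never touched.
    """
    corrected = list(tokens)
    if not rare_words:
        return corrected
    for i, flagged in enumerate(error_flags):
        if not flagged:
            continue  # skip positions the detector considers correct
        scored = [(SequenceMatcher(None, tokens[i], w).ratio(), w)
                  for w in rare_words]
        best_score, best_word = max(scored)
        if best_score >= min_similarity:
            corrected[i] = best_word
    return corrected


# Toy example with made-up data (not from the paper's datasets).
reference = "play songs by tchaikovsky".split()
hypothesis = "play songs by chaikovski".split()
flags = [False, False, False, True]  # assumed output of an error detector
rare_word_list = ["tchaikovsky", "rachmaninoff"]

fixed = correct_flagged(hypothesis, flags, rare_word_list)
print(wer(reference, hypothesis), wer(reference, fixed))
```

Because the loop skips every unflagged token, the cost of correction scales with the number of predicted errors rather than with the length of the transcript, which is the intuition behind restricting decoding to predicted error positions.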