CL | Jun 9, 2025

DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction

Solee Im, Wonjun Lee, Jinmyeong An, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee

arXiv:2506.07510v1 | 4.93 citations | h-index: 15 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work addresses ASR error correction for named entities and is an incremental improvement over existing methods such as RAGEC.

The paper targets Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems. It introduces DeRAGEC, which uses synthetic denoising rationales to filter noisy NE candidates, and reports a 28% relative reduction in Word Error Rate (WER) compared to baseline ASR without postprocessing.
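
A relative WER reduction is the drop in WER divided by the baseline WER. The following is a minimal Python sketch, assuming the standard word-level edit-distance definition of WER. The function names `word_error_rate` and `relative_reduction` are hypothetical and are not taken from the paper or its code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, corrected_wer: float) -> float:
    """Fraction of the baseline WER removed by postprocessing."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - corrected_wer) / baseline_wer
```

As an illustration with made-up numbers (not results from the paper), a baseline WER of 0.25 that drops to 0.18 after correction corresponds to a relative reduction of 0.28, that is, 28%.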

We present DeRAGEC, a method for improving Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems. By extending the Retrieval-Augmented Generative Error Correction (RAGEC) framework, DeRAGEC employs synthetic denoising rationales to filter out noisy NE candidates before correction. By leveraging phonetic similarity and augmented definitions, it refines noisy retrieved NEs using in-context learning, requiring no additional training. Experimental results on CommonVoice and STOP datasets show significant improvements in Word Error Rate (WER) and NE hit ratio, outperforming baseline ASR and RAGEC methods. Specifically, we achieved a 28% relative reduction in WER compared to ASR without postprocessing. Our source code is publicly available at: https://github.com/solee0022/deragec
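
The candidate filtering idea can be illustrated with a rough Python sketch. This is not the authors' method: DeRAGEC denoises candidates with an LLM through in-context learning, using synthetic rationales, phonetic similarity, and augmented definitions. The sketch below swaps all of that for a plain similarity threshold, and uses character-level similarity from the standard library `difflib` as a crude stand-in for phonetic similarity. The function names, the threshold, and `top_k` are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a crude proxy for phonetic similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def filter_candidates(
    asr_span: str,
    candidates: list[str],
    threshold: float = 0.6,  # assumed cutoff, not a value from the paper
    top_k: int = 3,          # assumed cap on surviving candidates
) -> list[str]:
    """Keep retrieved NE candidates that resemble the ASR span, best first."""
    scored = [(cand, similarity(asr_span, cand)) for cand in candidates]
    kept = [(cand, score) for cand, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [cand for cand, _ in kept[:top_k]]
```

For example, `filter_candidates("a dell", ["Adele", "Dell", "Beyonce"])` drops "Beyonce" because it shares almost no characters with the ASR span, so a later correction step would see fewer distracting candidates.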
