CL · Jun 9, 2025

DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction

arXiv:2506.07510v1 · 2 citations · h-index: 15 · Has Code · ACL
Originality: Incremental advance
AI Analysis

This work addresses ASR error correction for named entities and is an incremental improvement over existing methods such as RAGEC.

The paper improves Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems by introducing DeRAGEC, which uses synthetic denoising rationales to filter noisy NE candidates before correction. It reports a 28% relative reduction in Word Error Rate (WER) compared to baseline ASR without postprocessing.

We present DeRAGEC, a method for improving Named Entity (NE) correction in Automatic Speech Recognition (ASR) systems. By extending the Retrieval-Augmented Generative Error Correction (RAGEC) framework, DeRAGEC employs synthetic denoising rationales to filter out noisy NE candidates before correction. By leveraging phonetic similarity and augmented definitions, it refines noisy retrieved NEs using in-context learning, requiring no additional training. Experimental results on CommonVoice and STOP datasets show significant improvements in Word Error Rate (WER) and NE hit ratio, outperforming baseline ASR and RAGEC methods. Specifically, we achieved a 28% relative reduction in WER compared to ASR without postprocessing. Our source code is publicly available at: https://github.com/solee0022/deragec
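To make the headline metric concrete, the following is a minimal sketch of how WER and a relative WER reduction are typically computed. It is not taken from the paper's repository: the function names `word_error_rate` and `relative_reduction`, the example utterance, and the WER values 0.25 and 0.18 are all assumptions chosen only so that the arithmetic works out to 28%. They are not the paper's reported numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, corrected_wer: float) -> float:
    """Fraction of the baseline error that the correction removes."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - corrected_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical utterance in which the ASR output garbles a named entity.
    reference = "play songs by billie eilish"
    asr_output = "play songs by billy eyelash"
    print(word_error_rate(reference, asr_output))

    # Hypothetical corpus-level WERs, chosen to illustrate a 28% relative drop.
    print(round(relative_reduction(0.25, 0.18), 2))
```

A relative reduction is measured against the baseline error, not in absolute points. In the hypothetical case above, WER falls by 7 absolute points (0.25 to 0.18), which is 28% of the baseline.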

