AS AI Sep 18, 2025

Listening, Imagining & Refining: A Heuristic Optimized ASR Correction Framework with LLMs

Yutong Liu, Ziyue Zhang, Cheng Huang, Yongbin Yu, Xiangxiang Wang, Yuqing Cai, Nyima Tashi

arXiv:2509.15095v2, 4.33 citations, h-index: 17

Originality: Incremental advance

AI Analysis

The paper addresses transcription accuracy for ASR users but appears incremental, as it builds on existing LLM-based correction methods.

The paper tackles ASR error correction by proposing LIR-ASR, a heuristic optimized iterative framework using LLMs, which reduces CER/WER by up to 1.5 percentage points on English and Chinese data.
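For readers unfamiliar with the metrics, the following is a minimal sketch of how WER and CER are commonly computed from an edit distance, and of what a reduction in percentage points means. The function names and the example sentences are illustrative assumptions, not taken from the paper, and the CER variant here assumes spaces are ignored.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate (assumption: spaces are stripped before comparison)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Illustrative sentences only (not from the paper's data).
reference = "the cat sat on the mat"
baseline = "the cat sad on the mat"
corrected = "the cat sat on the mat"

# A reduction in percentage points is the plain difference of the two rates.
reduction_pp = 100 * (wer(reference, baseline) - wer(reference, corrected))
```

A reduction "in percentage points" is an absolute difference between two error rates (for example, 10.0% down to 8.5% is 1.5 points), not a relative change.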

Automatic Speech Recognition (ASR) systems remain prone to errors that affect downstream applications. In this paper, we propose LIR-ASR, a heuristic optimized iterative correction framework using LLMs, inspired by human auditory perception. LIR-ASR applies a "Listening-Imagining-Refining" strategy, generating phonetic variants and refining them in context. A heuristic optimization with a finite state machine (FSM) is introduced to prevent the correction process from being trapped in local optima, and rule-based constraints help maintain semantic fidelity. Experiments on both English and Chinese ASR outputs show that LIR-ASR achieves average reductions in CER/WER of up to 1.5 percentage points compared to baselines, demonstrating substantial accuracy gains in transcription.
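The control flow described in the abstract can be pictured as a small state machine that cycles through listening, imagining, and refining. The sketch below is a toy illustration under heavy assumptions, not the paper's implementation: the state names, the SOUND_ALIKES table, the PLAUSIBLE_PAIRS set, and the score function are all hypothetical stand-ins for what an LLM would supply, and the loop does not reproduce the paper's heuristic for escaping local optima. The only rule-based constraint shown is that each replacement is one-for-one, so the word count never changes.

```python
from enum import Enum, auto


class State(Enum):
    LISTEN = auto()   # locate the next word that may have been misheard
    IMAGINE = auto()  # propose sound-alike variants for that word
    REFINE = auto()   # keep a variant only if it fits the context better
    DONE = auto()


# Hypothetical sound-alike table standing in for LLM-generated phonetic variants.
SOUND_ALIKES = {"whether": ["weather"], "grate": ["great"], "their": ["there"]}

# Hypothetical plausible word pairs standing in for an LLM's contextual judgment.
PLAUSIBLE_PAIRS = {
    ("the", "weather"), ("weather", "is"), ("is", "great"), ("great", "today"),
}


def score(words):
    """Toy context score: number of adjacent word pairs judged plausible."""
    return sum((a, b) in PLAUSIBLE_PAIRS for a, b in zip(words, words[1:]))


def correct(transcript, max_steps=50):
    """Run the toy Listen -> Imagine -> Refine loop over a transcript."""
    words = transcript.split()
    state, pos, variants = State.LISTEN, 0, []
    for _ in range(max_steps):
        if state is State.LISTEN:
            # Skip words that have no known sound-alike.
            while pos < len(words) and words[pos] not in SOUND_ALIKES:
                pos += 1
            state = State.DONE if pos == len(words) else State.IMAGINE
        elif state is State.IMAGINE:
            variants = SOUND_ALIKES[words[pos]]
            state = State.REFINE
        elif state is State.REFINE:
            # One-for-one substitution keeps the word count fixed.
            candidates = [words[:pos] + [v] + words[pos + 1:] for v in variants]
            best = max(candidates, key=score)
            if score(best) > score(words):
                words = best
            pos += 1
            state = State.LISTEN
        else:
            break
    return " ".join(words)


result = correct("the whether is grate today")
```

In a real system the lookup table and pair set would be replaced by model calls, and the transition rules would carry the heuristic that decides when to revisit or abandon a candidate.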
