AS | AI | Sep 18, 2025

Listening, Imagining & Refining: A Heuristic Optimized ASR Correction Framework with LLMs

arXiv:2509.15095v2 | 3 citations | h-index: 17
Originality: Incremental advance
AI Analysis

The paper addresses transcription accuracy for ASR users, but it appears incremental because it builds on existing LLM-based correction methods.

The paper tackles ASR error correction by proposing LIR-ASR, a heuristic optimized iterative framework using LLMs, which reduces average CER/WER by up to 1.5 percentage points relative to baselines on English and Chinese data.

Automatic Speech Recognition (ASR) systems remain prone to errors that affect downstream applications. In this paper, we propose LIR-ASR, a heuristic optimized iterative correction framework using LLMs, inspired by human auditory perception. LIR-ASR applies a "Listening-Imagining-Refining" strategy, generating phonetic variants and refining them in context. A heuristic optimization with a finite state machine (FSM) is introduced to prevent the correction process from being trapped in local optima, and rule-based constraints help maintain semantic fidelity. Experiments on both English and Chinese ASR outputs show that LIR-ASR achieves average reductions in CER/WER of up to 1.5 percentage points compared to baselines, demonstrating substantial accuracy gains in transcription.
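
To make the reported metric concrete, here is a minimal sketch of how WER and CER are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. It is not the paper's evaluation code. The function names, the toy sentences, and the choice to strip spaces before computing CER are assumptions made for illustration. A "percentage point" reduction is the plain difference between two such rates expressed in percent.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces are stripped here (an assumed convention)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


# Toy strings invented for illustration, not taken from the paper's data.
reference = "the cat sat on the mat"
asr_output = "the cat sad on the mat"   # one substituted word
corrected = "the cat sat on the mat"    # hypothetical corrected transcript

wer_before = wer(reference, asr_output)
wer_after = wer(reference, corrected)
reduction_pp = (wer_before - wer_after) * 100  # percentage points
print(f"WER before: {wer_before:.2%}, after: {wer_after:.2%}, "
      f"reduction: {reduction_pp:.2f} percentage points")
```

In practice WER is usually reported for English and CER for Chinese, since Chinese text has no whitespace word boundaries. Tokenization and text normalization choices can shift these numbers noticeably, so results are only comparable when the same conventions are used.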

