CL, AI, SD, AS, Jan 22, 2025

FlanEC: Exploring Flan-T5 for Post-ASR Error Correction

Georgia Tech
arXiv:2501.12979v1, 5 citations, h-index: 33, SLT
Originality: Synthesis-oriented
AI Analysis

This work addresses error correction in ASR transcriptions for applications requiring high accuracy, but it is incremental as it applies an existing model to a specific task.

The paper applies Flan-T5 to post-ASR error correction, aiming to improve the linguistic correctness and grammaticality of Automatic Speech Recognition (ASR) output, and evaluates the approach on the HyPoradise dataset.

In this paper, we present an encoder-decoder model leveraging Flan-T5 for post-Automatic Speech Recognition (ASR) Generative Speech Error Correction (GenSEC), and we refer to it as FlanEC. We explore its application within the GenSEC framework to enhance ASR outputs by mapping n-best hypotheses into a single output sentence. By utilizing n-best lists from ASR models, we aim to improve the linguistic correctness, accuracy, and grammaticality of final ASR transcriptions. Specifically, we investigate whether scaling the training data and incorporating diverse datasets can lead to significant improvements in post-ASR error correction. We evaluate FlanEC using the HyPoradise dataset, providing a comprehensive analysis of the model's effectiveness in this domain. Furthermore, we assess the proposed approach under different settings to evaluate model scalability and efficiency, offering valuable insights into the potential of instruction-tuned encoder-decoder models for this task.
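As a rough illustration of "mapping n-best hypotheses into a single output sentence", the sketch below shows one way an n-best list could be flattened into a single text input for an instruction-tuned encoder-decoder model. The function name `build_nbest_prompt`, the instruction wording, the numbered layout, and the de-duplication step are all assumptions made for this example; the abstract does not specify the prompt format the authors use.

```python
from typing import List


def build_nbest_prompt(hypotheses: List[str], n: int = 5) -> str:
    """Join the top-n ASR hypotheses into one instruction-style input string.

    The instruction wording and the numbered layout are illustrative
    assumptions, not the prompt used by the FlanEC authors.
    """
    # Drop empty strings and exact duplicates while keeping the ASR ranking.
    seen = set()
    unique = []
    for hyp in hypotheses:
        text = hyp.strip()
        if text and text not in seen:
            seen.add(text)
            unique.append(text)

    # Number the surviving hypotheses, best first, and cap the list at n.
    numbered = "\n".join(f"{i}. {h}" for i, h in enumerate(unique[:n], start=1))
    return (
        "Generate the correct transcription from these ASR hypotheses:\n"
        f"{numbered}"
    )


# Hypothetical n-best list from an ASR system, best hypothesis first.
nbest = [
    "i scream for ice cream",
    "eye scream for ice cream",
    "i scream for ice cream",  # duplicate, removed before numbering
    "i scream four ice cream",
]
prompt = build_nbest_prompt(nbest, n=3)
print(prompt)
```

In a fine-tuning setup of this kind, the resulting string would typically be tokenized as the encoder input, with the reference transcript as the decoder target; that step is omitted here because it depends on the model checkpoint and training code.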

Code Implementations: 1 repo