CL · Oct 27, 2022

TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection

Microsoft
arXiv:2210.15104v1 · 1 citation · h-index: 13
Originality: Highly original
AI Analysis

TRScore provides an automated metric for ASR readability, addressing the cost and variability of human evaluation, particularly for conversational speech.

The paper tackles the evaluation of readability in ASR segmentation and punctuation models, which traditional metrics such as F1 scores capture poorly. It introduces TRScore, a GPT-based readability measure that correlates strongly with human readability scores (Pearson's correlation coefficient of 0.98) and removes the need for human transcriptions during model selection.

Punctuation and segmentation are key to readability in Automatic Speech Recognition (ASR), often evaluated using F1 scores that require high-quality human transcripts and do not reflect readability well. Human evaluation is expensive, time-consuming, and suffers from large inter-observer variability, especially in conversational speech devoid of strict grammatical structures. Large pre-trained models capture a notion of grammatical structure. We present TRScore, a novel readability measure using the GPT model to evaluate different segmentation and punctuation systems. We validate our approach with human experts. Additionally, our approach enables quantitative assessment of the effect of text post-processing techniques such as capitalization, inverse text normalization (ITN), and disfluency handling on overall readability, which traditional word error rate (WER) and slot error rate (SER) metrics fail to capture. TRScore is strongly correlated with traditional F1 and human readability scores, with Pearson's correlation coefficients of 0.67 and 0.98, respectively. It also eliminates the need for human transcriptions for model selection.
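
The abstract reports agreement between TRScore and human ratings as a Pearson's correlation coefficient. As a minimal sketch of how such a coefficient is computed, the snippet below correlates two lists of per-system scores. The function name `pearson` and both score lists are hypothetical and invented purely for illustration; they are not values or code from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five segmentation/punctuation systems.
# These numbers are made up for illustration, not taken from the paper.
automatic_scores = [2.1, 2.8, 3.4, 3.9, 4.5]
human_scores = [2.0, 3.0, 3.3, 4.1, 4.4]

r = pearson(automatic_scores, human_scores)
print(f"Pearson r = {r:.2f}")
```

A coefficient near 1 means the automatic metric ranks and spaces the systems much as the human raters do, which is the property that would justify using it for model selection without human transcripts.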
