CL | SD | AS | Jun 21, 2023

NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning

arXiv:2306.12577v1 | 5 citations | h-index: 13 | Has Code
Originality: Incremental advance
AI Analysis

NoRefER addresses researchers' and developers' need for efficient ASR evaluation, though the advance is incremental because it builds on existing language models and contrastive learning techniques.

The paper tackles the problem of evaluating automatic speech recognition (ASR) systems without costly ground-truth transcripts by introducing NoRefER, a referenceless quality metric that uses semi-supervised language model fine-tuning with contrastive learning, achieving high correlation with reference-based metrics.

This paper introduces NoRefER, a novel referenceless quality metric for automatic speech recognition (ASR) systems. Traditional reference-based metrics for evaluating ASR systems require costly ground-truth transcripts. NoRefER overcomes this limitation by fine-tuning a multilingual language model for pair-wise ranking of ASR hypotheses using contrastive learning with a Siamese network architecture. The self-supervised NoRefER exploits the known quality relationships between hypotheses from multiple compression levels of an ASR model to learn to rank intra-sample hypotheses by quality, which is essential for model comparisons. The semi-supervised version also uses a referenced dataset to improve its inter-sample quality ranking, which is crucial for selecting potentially erroneous samples. The results show that NoRefER correlates highly with reference-based metrics and their intra-sample ranks, indicating a high potential for referenceless ASR evaluation or A/B testing.
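The self-supervised pair-wise ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a lower compression level yields the better hypothesis, uses a logistic loss on the score difference as a stand-in for the paper's contrastive objective (which the abstract does not specify), and replaces the fine-tuned multilingual language model with a hypothetical bag-of-words scorer, `toy_score`. The names `build_pairs` and `pairwise_ranking_loss` are likewise invented for illustration.

```python
import math
from itertools import combinations


def build_pairs(hypotheses_by_level):
    """Turn (compression_level, text) items for ONE audio sample into
    (better, worse) training pairs.

    Assumption: a lower compression level means a higher-quality hypothesis.
    """
    pairs = []
    for (lvl_a, hyp_a), (lvl_b, hyp_b) in combinations(hypotheses_by_level, 2):
        if lvl_a == lvl_b or hyp_a == hyp_b:
            continue  # identical levels or texts carry no ranking signal
        better, worse = (hyp_a, hyp_b) if lvl_a < lvl_b else (hyp_b, hyp_a)
        pairs.append((better, worse))
    return pairs


def toy_score(text, weights):
    """Hypothetical stand-in for the language-model scorer.

    The same function (same weights) scores both sides of a pair,
    which is what makes the setup Siamese.
    """
    tokens = text.split()
    return sum(weights.get(tok, 0.0) for tok in tokens) / max(len(tokens), 1)


def pairwise_ranking_loss(score_better, score_worse):
    """Logistic ranking loss, -log(sigmoid(score_better - score_worse)),
    computed as a numerically stable softplus of the negated difference."""
    x = score_worse - score_better
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Made-up hypotheses for a single utterance at three compression levels.
    sample = [
        (0, "the cat sat on the mat"),
        (1, "the cat sat on a mat"),
        (2, "the cat sad on a mat"),
    ]
    weights = {"sad": -1.0}  # hypothetical learned weights
    losses = [
        pairwise_ranking_loss(toy_score(b, weights), toy_score(w, weights))
        for b, w in build_pairs(sample)
    ]
    print(sum(losses) / len(losses))
```

The loss shrinks as the shared scorer assigns a higher score to the hypothesis from the less compressed model, so minimizing it over many such pairs teaches the scorer to rank hypotheses of the same sample without any reference transcript.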

Code Implementations: 1 repo
