CL | SD | AS | Aug 28, 2024

Beyond Levenshtein: Leveraging Multiple Algorithms for Robust Word Error Rate Computations And Granular Error Classifications

arXiv:2408.15616v1 | 2 citations | h-index: 4 | Has Code
Originality: Incremental advance
AI Analysis

This work gives ASR researchers and practitioners a more granular error analysis. The advance is incremental because it builds on existing algorithms.

The paper tackles the loss of orthographic information in standard Word Error Rate (WER) computations for Automatic Speech Recognition (ASR). It introduces a non-destructive, token-based approach that uses an extended Levenshtein distance algorithm to compute a robust WER and additional metrics such as a punctuation error rate. An evaluation on several datasets shows practical equivalence to common WER computations.

The Word Error Rate (WER) is the common measure of accuracy for Automatic Speech Recognition (ASR). Transcripts are usually pre-processed by substituting specific characters to account for non-semantic differences. As a result of this normalisation, information on the accuracy of punctuation or capitalisation is lost. We present a non-destructive, token-based approach using an extended Levenshtein distance algorithm to compute a robust WER and additional orthographic metrics. Transcription errors are also classified more granularly by existing string similarity and phonetic algorithms. An evaluation on several datasets demonstrates the practical equivalence of our approach compared to common WER computations. We also provide an exemplary analysis of derived use cases, such as a punctuation error rate, and a web application for interactive use and visualisation of our implementation. The code is available open-source.
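To make the idea of a non-destructive, token-based comparison concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`tokenize`, `align`, `error_rates`), the tokenisation regex, and the exact metric definitions are assumptions chosen for illustration, and plain Levenshtein alignment stands in for the paper's extended algorithm. Words are aligned case-insensitively while the original tokens are kept, so capitalisation and punctuation differences can still be counted afterwards.

```python
import re
from typing import Callable, Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumed tokenisation: words (with inner apostrophes) and single
    # punctuation marks become separate tokens; nothing is discarded.
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", text)


def align(ref: List[str], hyp: List[str],
          key: Callable[[str], str] = lambda t: t) -> List[Tuple[str, str, str]]:
    """Plain Levenshtein alignment. Tokens are compared via `key`, but the
    original tokens are returned as (operation, ref_token, hyp_token)."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if key(ref[i - 1]) == key(hyp[j - 1]) else 1
            d[i][j] = min(d[i - 1][j - 1] + cost,  # match or substitution
                          d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1)         # insertion
    # Backtrace from the bottom-right cell to recover the operations.
    ops: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and key(ref[i - 1]) == key(hyp[j - 1])
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (0 if same else 1):
            ops.append(("equal" if same else "substitute", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("delete", ref[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def error_rates(reference: str, hypothesis: str) -> Dict[str, float]:
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    is_word = lambda t: re.match(r"\w", t) is not None
    ref_words = [t for t in ref if is_word(t)]
    hyp_words = [t for t in hyp if is_word(t)]
    ref_punct = [t for t in ref if not is_word(t)]
    hyp_punct = [t for t in hyp if not is_word(t)]

    # WER: word-level edits, ignoring case, over the reference word count.
    word_ops = align(ref_words, hyp_words, key=str.lower)
    word_errors = sum(op != "equal" for op, _, _ in word_ops)
    # Words that match when case is ignored but differ in the original text.
    cap_mismatches = sum(op == "equal" and r != h for op, r, h in word_ops)
    # Assumed definition: edits between the punctuation-only sequences.
    punct_errors = sum(op != "equal" for op, _, _ in align(ref_punct, hyp_punct))

    return {
        "wer": word_errors / max(1, len(ref_words)),
        "capitalisation_mismatches": cap_mismatches,
        "punctuation_error_rate": punct_errors / max(1, len(ref_punct)),
    }


if __name__ == "__main__":
    print(error_rates("Hello, world. How are you?",
                      "hello world how are you today?"))
```

For the example pair, the only word-level edit is the inserted "today", giving a WER of 1/5. The two case differences ("Hello"/"hello", "How"/"how") and the two missing punctuation marks are reported separately instead of being normalised away. Aligning punctuation independently of the words is a simplification; a joint alignment over all tokens would preserve positional information.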

Code Implementations: 1 repo
