CL, AS | Jul 21, 2023

MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems

Thilo von Neumann, Christoph Boeddeker, Marc Delcroix, Reinhold Haeb-Umbach

arXiv:2307.11394v37.347 citations | h-index: 40 | Has Code

Originality: Synthesis-oriented

AI Analysis

MeetEval provides a standardized evaluation tool for researchers and developers working on meeting transcription, but the contribution is incremental because it builds on existing WER definitions.

The authors address the evaluation of meeting transcription systems by introducing MeetEval, a toolkit that computes Word Error Rates (WERs). A temporal constraint added to the cpWER computation improves the quality of the matching between hypothesis and reference, and it also speeds up the matching algorithm.

MeetEval is an open-source toolkit to evaluate all kinds of meeting transcription systems. It provides a unified interface for the computation of commonly used Word Error Rates (WERs), specifically cpWER, ORC-WER and MIMO-WER, along with other WER definitions. We extend the cpWER computation by a temporal constraint to ensure that words are only identified as correct when the temporal alignment is plausible. This leads to a better quality of the matching of the hypothesis string to the reference string that more closely resembles the actual transcription quality, and a system is penalized if it provides poor time annotations. Since word-level timing information is often not available, we present a way to approximate exact word-level timings from segment-level timings (e.g., a sentence) and show that the approximation leads to a similar WER as a matching with exact word-level annotations. At the same time, the time constraint leads to a speedup of the matching algorithm, which outweighs the additional overhead caused by processing the time stamps.
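The two ideas in the abstract, approximating word-level timings from a segment and only allowing matches that are temporally plausible, can be illustrated with a minimal Python sketch. This is not MeetEval's API or necessarily the authors' method: the function names `approximate_word_timings` and `may_match`, the `collar` parameter, and the choice to divide the segment in proportion to each word's character count are all assumptions made for illustration.

```python
def approximate_word_timings(words, start, end):
    """Split the segment [start, end] among its words.

    Assumption: each word gets a share of the segment duration
    proportional to its number of characters. This is one possible
    heuristic, not necessarily the one used by the toolkit.
    """
    if not words:
        return []
    total_chars = sum(len(w) for w in words)
    duration = end - start
    timings = []
    consumed = 0  # characters assigned to earlier words
    for w in words:
        w_start = start + duration * consumed / total_chars
        consumed += len(w)
        w_end = start + duration * consumed / total_chars
        timings.append((w, w_start, w_end))
    return timings


def may_match(ref_interval, hyp_interval, collar=0.0):
    """Return True if the two intervals overlap once the hypothesis
    interval is widened by `collar` seconds on both sides.

    A time-constrained matching would only consider word pairs for
    which this check passes (hypothetical helper).
    """
    ref_start, ref_end = ref_interval
    hyp_start, hyp_end = hyp_interval
    return hyp_start - collar < ref_end and ref_start < hyp_end + collar


if __name__ == "__main__":
    for word, s, e in approximate_word_timings("hello world".split(), 0.0, 10.0):
        print(f"{word}: {s:.2f} to {e:.2f}")
```

A matching algorithm that skips every word pair failing `may_match` has fewer candidate alignments to explore, which is consistent with the speedup the abstract reports for the time constraint.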
