CL · Jun 12, 2022

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

arXiv:2206.05807v332.6649 citations · h-index: 47 · Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses an evaluation bias in simultaneous speech translation. The contribution is incremental, but it matters for accurate latency assessment in real-time applications.

The paper identifies that Average Lagging (AL) underestimates latency for simultaneous speech translation systems that over-generate, and proposes LAAL, a modified metric that corrects this bias, showing improvements in evaluation fairness.

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL). In this paper we highlight that, despite its widespread adoption, AL provides underestimated scores for systems that generate longer predictions compared to the corresponding references. We also show that this problem has practical relevance, as recent SimulST systems have indeed a tendency to over-generate. As a solution, we propose LAAL (Length-Adaptive Average Lagging), a modified version of the metric that takes into account the over-generation phenomenon and allows for unbiased evaluation of both under-/over-generating systems.
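The abstract does not give the formulas, so the following is only a minimal sketch based on the commonly cited definitions: AL averages, over the emitted words up to the first one produced after the whole source has been consumed, the gap between each word's actual delay and an ideal delay that spreads the source duration evenly over the reference length. LAAL is assumed here to differ only in the normaliser, using the longer of the hypothesis and the reference. The function name `lagging` and its parameters are hypothetical and are not taken from the authors' code.

```python
from typing import Sequence


def lagging(
    delays: Sequence[float],
    source_length: float,
    reference_length: int,
    length_adaptive: bool = False,
) -> float:
    """Sketch of AL (length_adaptive=False) and LAAL (length_adaptive=True).

    delays[i] is the amount of source (e.g. in ms) consumed when the
    i-th hypothesis word was emitted; source_length is the total source
    duration in the same unit; reference_length is the reference word count.
    """
    if not delays:
        raise ValueError("delays must contain at least one emitted word")
    if source_length <= 0 or reference_length <= 0:
        raise ValueError("source_length and reference_length must be positive")

    hypothesis_length = len(delays)
    # Assumed LAAL change: normalise by the longer of hypothesis and reference,
    # so extra words no longer shrink the ideal delay step artificially.
    target_length = (
        max(hypothesis_length, reference_length)
        if length_adaptive
        else reference_length
    )

    total = 0.0
    counted = 0
    for i, delay in enumerate(delays):
        # Ideal delay of the i-th word (0-based) for an oracle system that
        # emits words at a constant rate over the whole source.
        ideal_delay = i * source_length / target_length
        total += delay - ideal_delay
        counted += 1
        # Stop at the first word emitted once the full source has been read.
        if delay >= source_length:
            break
    return total / counted


# Hypothetical over-generating system: 8 words against a 4-word reference.
over_generated = [200, 300, 400, 500, 600, 700, 800, 1000]
al = lagging(over_generated, source_length=1000, reference_length=4)
laal = lagging(
    over_generated, source_length=1000, reference_length=4, length_adaptive=True
)
```

With these made-up delays, `al` comes out at -312.5 ms, a negative latency that rewards over-generation, while `laal` comes out at 125.0 ms. For a hypothesis no longer than the reference, both variants return the same value under this sketch.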
