LG, CL, AS | Aug 7, 2025

REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation

arXiv:2508.04946v2 | 1 citation | h-index: 13 | Has Code
Originality: Highly original
AI Analysis

This work addresses efficient real-time translation for applications such as live captioning and interpretation, and represents an incremental improvement over prior methods.

The paper tackles the challenge of balancing translation quality and latency in Simultaneous Speech Translation by introducing REINA, a novel loss function based on information theory, which improves the latency/quality trade-off by up to 21% and achieves state-of-the-art streaming results on multiple language pairs.

Simultaneous Speech Translation (SimulST) systems stream in audio while simultaneously emitting translated text or speech. Such systems face the significant challenge of balancing translation quality and latency. We introduce a strategy to optimize this tradeoff: wait for more input only if you gain information by doing so. Based on this strategy, we present Regularized Entropy INformation Adaptation (REINA), a novel loss to train an adaptive policy using an existing non-streaming translation model. We derive REINA from information theory principles and show that REINA helps push the reported Pareto frontier of the latency/quality tradeoff over prior works. Utilizing REINA, we train a SimulST model on French, Spanish and German, both from and into English. Training on only open source or synthetically generated data, we achieve state-of-the-art (SOTA) streaming results for models of comparable size. We also introduce a metric for streaming efficiency, quantitatively showing REINA improves the latency/quality trade-off by as much as 21% compared to prior approaches, normalized against non-streaming baseline BLEU scores.
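The strategy in the abstract, "wait for more input only if you gain information by doing so", can be illustrated with a small toy sketch. This is not the REINA loss itself, whose exact form is not given here. It assumes, purely for illustration, that information gain is measured as the drop in next-token entropy between a distribution conditioned on partial audio and one conditioned on the full audio. The function names, the `threshold` parameter, and the example distributions are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_gain(p_partial, p_full):
    """Entropy reduction from conditioning on the full input instead of the partial one.

    Assumption: this entropy difference stands in for "information gained by waiting".
    """
    return entropy(p_partial) - entropy(p_full)


def decide(p_partial, p_full, threshold=0.1):
    """Return "READ" (wait for more audio) if the gain exceeds the threshold, else "WRITE"."""
    return "READ" if information_gain(p_partial, p_full) > threshold else "WRITE"


# Toy next-token distributions over a 4-word vocabulary (hypothetical values).
uncertain = [0.25, 0.25, 0.25, 0.25]   # model is unsure given partial audio
confident = [0.97, 0.01, 0.01, 0.01]   # model is sure given full audio

print(decide(uncertain, confident))  # READ: more audio resolves the uncertainty
print(decide(confident, confident))  # WRITE: waiting adds no information
```

In this sketch the full-audio distribution is available only at training time, for example from a non-streaming translation model of the kind the abstract mentions. A streaming policy would have to learn to predict the read/write decision from partial input alone.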

