CL · AI · Dec 22, 2023

Language Model is a Branch Predictor for Simultaneous Machine Translation

arXiv:2312.14488v1 · 2 citations · h-index: 11 · Has Code · ICASSP
Originality: Incremental advance
AI Analysis

This work addresses latency reduction for simultaneous machine translation systems, presenting an incremental improvement by adapting CPU branch prediction techniques to this domain.

The paper tackles the problem of reducing latency in simultaneous machine translation by proposing a method that uses a language model as a branch predictor to forecast future source words, enabling early decoding and correction when predictions are wrong. Experimental results show improvements in both translation quality and latency.

The primary objective of simultaneous machine translation (SiMT) is to minimize latency while preserving the quality of the final translation. Drawing inspiration from CPU branch prediction, we propose incorporating branch prediction into SiMT to reduce translation latency. Specifically, we use a language model as a branch predictor to predict potential branch directions, namely, future source words. We then use the predicted source words to decode the output in advance. When the actual source word deviates from the predicted source word, we decode the output again with the real source word and replace the predicted output. To further reduce computational costs, we share the parameters of the encoder and the branch predictor, and initialize them from a pre-trained language model. Our proposed method can be seamlessly integrated with any SiMT model. Extensive experimental results demonstrate that our approach can improve translation quality and latency at the same time. Our code is available at https://github.com/YinAoXiong/simt_branch_predictor.
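
The predict, decode ahead, and correct loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `translate_stream`, `toy_predict`, `toy_decode`, and the bigram table are hypothetical stand-ins. A real system would use a language model as the predictor and a SiMT model as the decoder, and would run the speculative decode while waiting for the next source word rather than sequentially as here.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces, assumed for illustration only.
Predictor = Callable[[List[str]], str]  # source prefix -> guessed next source word
Decoder = Callable[[List[str]], str]    # source prefix -> target-side output


def translate_stream(
    source: List[str], predict_next: Predictor, decode: Decoder
) -> Tuple[List[str], int]:
    """Decode one output per source word, speculating on each upcoming word.

    Returns the outputs and the number of correct predictions (hits).
    """
    prefix: List[str] = []
    outputs: List[str] = []
    hits = 0
    for actual in source:
        # Branch prediction: guess the next source word before it arrives
        # and decode ahead of time using the guess.
        guess = predict_next(prefix)
        speculative = decode(prefix + [guess])
        if actual == guess:
            # Correct prediction: the early output can be kept as is.
            outputs.append(speculative)
            hits += 1
        else:
            # Misprediction: decode again with the real source word and
            # replace the speculative output.
            outputs.append(decode(prefix + [actual]))
        prefix.append(actual)
    return outputs, hits


# Toy stand-ins: a bigram lookup as the "language model" and an
# upper-casing function as the "translation model".
_BIGRAMS = {None: "the", "the": "dog", "cat": "sat"}


def toy_predict(prefix: List[str]) -> str:
    last: Optional[str] = prefix[-1] if prefix else None
    return _BIGRAMS.get(last, "<unk>")


def toy_decode(prefix: List[str]) -> str:
    return prefix[-1].upper()


if __name__ == "__main__":
    outputs, hits = translate_stream(["the", "cat", "sat"], toy_predict, toy_decode)
    print(outputs, hits)
```

In this toy run the predictor guesses "dog" after "the", so the second step falls back to decoding with the real word "cat". The other two steps reuse the speculative output, which is where the latency saving would come from in a real system.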

