Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Translation
This work addresses the challenge of real-time translation for applications like live captioning, but it is incremental as it builds on existing token-level methods by shifting to an information-level approach.
The paper tackles the problem of balancing source and target information in simultaneous machine translation. It proposes a Wait-info Policy that operates at the information level: the amount of information in each token is quantified as 'info', and the decision to wait or output is made by comparing the total info of the target outputs with that of the received source inputs. Experiments show that it outperforms strong baselines and achieves a better balance.
Simultaneous machine translation (SiMT) outputs the translation while receiving the source inputs, and hence needs to balance the received source information and the translated target information to make a reasonable decision between waiting for inputs and outputting translation. Previous methods always balance source and target information at the token level, either directly waiting for a fixed number of tokens or adjusting the waiting based on the current token. In this paper, we propose a Wait-info Policy to balance source and target at the information level. We first quantify the amount of information contained in each token, named info. Then, during simultaneous translation, the decision to wait or output is made by comparing the total info of the previous target outputs with that of the received source inputs. Experiments show that our method outperforms strong baselines and achieves a better balance via the proposed info.
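The decision rule described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `wait_info_schedule`, the threshold `k`, and the exact form of the comparison (output once the accumulated source info leads the accumulated target info by at least `k`) are assumptions made for illustration. The per-token info values are also assumed to be given in advance, whereas a real system would have to estimate them during translation.

```python
from typing import List, Tuple


def wait_info_schedule(
    src_info: List[float],
    tgt_info: List[float],
    k: float = 1.0,
) -> List[Tuple[str, int]]:
    """Simulate READ/WRITE decisions from accumulated token info.

    Assumed rule (hypothetical, for illustration only): WRITE the next
    target token when the source info received so far exceeds the target
    info output so far by at least k; otherwise READ one more source token.
    """
    actions: List[Tuple[str, int]] = []
    read = 0          # number of source tokens received
    written = 0       # number of target tokens output
    src_total = 0.0   # accumulated info of received source tokens
    tgt_total = 0.0   # accumulated info of output target tokens

    while written < len(tgt_info):
        source_exhausted = read == len(src_info)
        if source_exhausted or src_total - tgt_total >= k:
            # Enough source info has been received (or none is left): output.
            actions.append(("WRITE", written))
            tgt_total += tgt_info[written]
            written += 1
        else:
            # The source lead is too small: wait for another source token.
            actions.append(("READ", read))
            src_total += src_info[read]
            read += 1
    return actions


if __name__ == "__main__":
    # With uniform info of 1.0 per token, the rule reduces to a fixed
    # token-level lag, as in the token-level policies mentioned above.
    for action, index in wait_info_schedule([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], k=1.0):
        print(action, index)
```

When every token carries the same info, this sketch behaves like a fixed-lag token-level policy; non-uniform info values are what let the schedule wait longer for information-poor source tokens or output earlier after information-rich ones.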