CLFeb 16, 2024

Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models

arXiv:2402.10552v49 citationsh-index: 44IWSLT
Originality Incremental advance
AI Analysis

This addresses efficiency issues in LLM-based SimulMT for real-time translation applications, representing an incremental improvement.

The paper tackles the trade-off between translation quality and latency in simultaneous machine translation (SimulMT) by proposing a conversational SimulMT framework using multi-turn-dialogue-based decoding with Llama2-7b-chat, achieving superior translation quality and comparable computational latency to specialized models on two benchmarks.

Simultaneous machine translation (SimulMT) presents a challenging trade-off between translation quality and latency. Recent studies have shown that LLMs can achieve good performance in SimulMT tasks. However, this often comes at the expense of high inference cost and latency. In this paper, we propose a conversational SimulMT framework to enhance the inference efficiency of LLM-based SimulMT through multi-turn-dialogue-based decoding. Our experiments with Llama2-7b-chat on two SimulMT benchmarks demonstrate the superiority of LLM in translation quality while achieving comparable computational latency to specialized SimulMT models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes