CL · Apr 22

Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech

CMU
arXiv:2604.21045 · 85.1 · h-index: 18 · Has Code
AI Analysis

For practitioners of simultaneous speech translation, this work reduces computational overhead while improving translation quality, though it is incremental, building on the existing multi-turn dialogue reformulation.

The paper tackles the high computational cost of simultaneous speech translation (SST) with LLMs by proposing Hierarchical Policy Optimization (HPO), which post-trains models that were fine-tuned on imperfect supervised fine-tuning (SFT) data. HPO achieves improvements of over +7 COMET and +1.25 MetricX at a latency of 1.5 seconds on English-to-Chinese/German/Japanese.

Simultaneous speech translation (SST) generates translations while receiving partial speech input. Recent advances show that large language models (LLMs) can substantially improve SST quality, but at the cost of high computational overhead. To reduce this cost, prior work reformulates SST as a multi-turn dialogue task, enabling full reuse of the LLM's key-value (KV) cache and eliminating redundant feature recomputation. However, this approach relies on supervised fine-tuning (SFT) data in dialogue form, for which few human annotations exist, and existing synthesis methods cannot guarantee data quality. In this work, we propose a Hierarchical Policy Optimization (HPO) approach that post-trains models trained on imperfect SFT data. We introduce a hierarchical reward that balances translation quality and latency objectives. Experiments on English-to-Chinese/German/Japanese demonstrate improvements of over +7 COMET score and +1.25 MetricX score at a latency of 1.5 seconds. Comprehensive ablation studies further validate the effectiveness of different quality rewards, hierarchical reward formulations, and segmentation strategies. Code is available at https://github.com/owaski/HPO
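
The multi-turn dialogue reformulation mentioned in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the `Turn` and `StreamingDialogue` names, the `<speech:...>` placeholder, and the convention that an empty assistant turn means "wait for more audio" are all hypothetical. The point it shows is that the history is append-only, so each new round extends the previous context as a strict prefix, which is the property that allows an LLM's KV cache to be reused rather than recomputed.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str      # "user" carries a speech chunk, "assistant" carries emitted text
    content: str


@dataclass
class StreamingDialogue:
    """Hypothetical append-only dialogue history for simultaneous translation."""
    turns: list[Turn] = field(default_factory=list)

    def add_round(self, chunk_id: str, partial_translation: str) -> None:
        # One round = one incoming speech chunk plus the text emitted for it.
        # An empty partial translation stands for a "wait" decision (assumption).
        self.turns.append(Turn("user", f"<speech:{chunk_id}>"))
        self.turns.append(Turn("assistant", partial_translation))

    def translation(self) -> str:
        # The full output is the concatenation of all assistant turns.
        return "".join(t.content for t in self.turns if t.role == "assistant")


dialogue = StreamingDialogue()
dialogue.add_round("0", "")              # not enough audio yet: wait
history_before = list(dialogue.turns)    # snapshot of the context so far
dialogue.add_round("1", "Guten ")
dialogue.add_round("2", "Morgen")

# Earlier turns are never rewritten, so the old context is a prefix of the new one.
prefix_preserved = dialogue.turns[: len(history_before)] == history_before
```

Because earlier turns are never edited, a decoder that has already processed `history_before` would only need to encode the newly appended turns. How speech features are actually embedded in each user turn is left out here, since the abstract does not specify it.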

