CL · AI · Jan 15

HOMURA: Taming the Sand-Glass for Time-Constrained LLM Translation via Reinforcement Learning

arXiv:2601.10187v2 · h-index: 5
Originality: Highly original
AI Analysis

The paper addresses the verbosity bias of LLM translation in time-sensitive applications such as subtitling and dubbing, offering a novel method for a known bottleneck.

LLM translations are often too verbose for time-constrained tasks such as subtitling. The paper introduces HOMURA, a reinforcement learning framework that jointly optimizes semantic fidelity and temporal compliance, and reports more precise length control than strong LLM baselines.

Large Language Models (LLMs) have made remarkable strides in multilingual translation but are hindered by a systemic cross-lingual verbosity bias, rendering them unsuitable for strict time-constrained tasks like subtitling and dubbing. Current prompt-engineering approaches struggle to resolve this conflict between semantic fidelity and rigid temporal feasibility. To bridge this gap, we first introduce Sand-Glass, a benchmark specifically designed to evaluate translation under syllable-level duration constraints. Furthermore, we propose HOMURA, a reinforcement learning framework that explicitly optimizes the trade-off between semantic preservation and temporal compliance. By employing a KL-regularized objective with a novel dynamic syllable-ratio reward, HOMURA effectively "tames" the output length. Experimental results demonstrate that our method significantly outperforms strong LLM baselines, achieving precise length control that respects linguistic density hierarchies without compromising semantic adequacy.
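To make the idea of a syllable-ratio reward inside a KL-regularized objective more concrete, here is a minimal illustrative sketch. It is not HOMURA's actual formulation: the abstract does not give the reward's functional form or say what makes it "dynamic". The vowel-group syllable counter, the fixed `target_ratio`, the `tolerance` band, the linear penalty, and all function names below are assumptions chosen for illustration only.

```python
import re

# Assumption: a crude English-only heuristic that treats each run of
# vowels as one syllable. Real syllable counting is language-specific.
VOWEL_GROUP = re.compile(r"[aeiouy]+", re.IGNORECASE)


def count_syllables(text: str) -> int:
    """Approximate syllable count: vowel groups per word, at least one per word."""
    words = re.findall(r"[A-Za-z]+", text)
    return sum(max(1, len(VOWEL_GROUP.findall(word))) for word in words)


def syllable_ratio_reward(
    source_syllables: int,
    target_text: str,
    target_ratio: float = 1.0,  # hypothetical: desired target/source syllable ratio
    tolerance: float = 0.1,     # hypothetical: deviation allowed without penalty
) -> float:
    """Return 1.0 inside the tolerance band, decaying linearly to 0.0 outside it."""
    if source_syllables <= 0:
        raise ValueError("source_syllables must be positive")
    ratio = count_syllables(target_text) / source_syllables
    deviation = abs(ratio - target_ratio)
    if deviation <= tolerance:
        return 1.0
    return max(0.0, 1.0 - (deviation - tolerance))


def kl_regularized_score(
    task_reward: float,
    logp_policy: float,
    logp_ref: float,
    beta: float = 0.1,  # hypothetical KL penalty weight
) -> float:
    """Generic KL-regularized RL score: reward minus a single-sample KL estimate.

    logp_policy and logp_ref are the sequence log-probabilities of the same
    sampled translation under the trained policy and the frozen reference model.
    """
    return task_reward - beta * (logp_policy - logp_ref)
```

In a sketch like this, a translation whose syllable count stays close to the source scores near 1.0, while an overlong one is penalized; the KL term discourages the policy from drifting far from the reference model while chasing the length reward. A semantic-adequacy term would need to be combined with the length reward to reflect the trade-off the abstract describes.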

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
