CL · Dec 29, 2024

Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs

Pratik Rakesh Singh, Mohammadi Zaki, Pankaj Wasnik

arXiv:2412.20440v11.02 citations · h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the problem of producing high-quality translations for dubbing and subtitling in Indian languages, enabling broader content accessibility, though it appears incremental as it builds on existing LLM methods.

The paper tackles neural machine translation for entertainment content by proposing a novel framework that incorporates adaptive context and style estimation to guide LLMs, resulting in significant improvements in COMET scores and win-ratio over state-of-the-art LLMs.

We address the challenging task of neural machine translation (NMT) in the entertainment domain, where the objective is to automatically translate dialogue from source-language content into a target language. This task has various applications, particularly in automatic dubbing, subtitling, and other content localization tasks, enabling source content to reach a wider audience. Traditional NMT systems typically translate individual sentences in isolation, without transferring knowledge of crucial elements such as the context and style of previously encountered sentences. In this work, we emphasize the importance of these fundamental aspects in producing pertinent and captivating translations. We demonstrate their significance through several examples and propose a novel framework for entertainment translation, which, to our knowledge, is the first of its kind. Furthermore, we introduce an algorithm to estimate the context and style of the current session and use these estimates to generate a prompt that guides a Large Language Model (LLM) to produce high-quality translations. Our method is both language- and LLM-agnostic, making it a general-purpose tool. We demonstrate the effectiveness of our algorithm through various numerical studies and observe significant improvements in COMET scores over various state-of-the-art LLMs. Moreover, our proposed method consistently outperforms baseline LLMs in terms of win-ratio.
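The abstract does not spell out how the estimated context and style are turned into a prompt, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes context is approximated by a short window of preceding dialogue lines and style by a single free-text label; `SessionState`, `build_prompt`, `max_context`, and the prompt wording are all hypothetical names introduced for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class SessionState:
    """Hypothetical rolling state for one dialogue session (an assumption)."""
    max_context: int = 3            # how many previous lines to keep
    style: str = "neutral"          # placeholder style label
    history: Deque[str] = field(default_factory=deque)

    def update(self, line: str, style: Optional[str] = None) -> None:
        # Keep only the most recent `max_context` lines as context.
        self.history.append(line)
        while len(self.history) > self.max_context:
            self.history.popleft()
        # Overwrite the style label only when a new estimate is supplied.
        if style is not None:
            self.style = style


def build_prompt(state: SessionState, source_line: str,
                 src_lang: str, tgt_lang: str) -> str:
    """Assemble a translation prompt from the current context and style."""
    context = "\n".join(f"- {line}" for line in state.history) or "- (none)"
    return (
        f"Translate the dialogue line from {src_lang} to {tgt_lang}.\n"
        f"Style of the scene: {state.style}\n"
        f"Previous lines:\n{context}\n"
        f"Line to translate: {source_line}"
    )


state = SessionState(max_context=2)
state.update("Where were you last night?")
state.update("I told you, I was working.", style="tense, informal")
state.update("Working? Until 3 a.m.?")
prompt = build_prompt(state, "Don't lie to me.", "English", "Hindi")
```

The resulting `prompt` string could be sent to any LLM, which is consistent with the abstract's claim that the approach is LLM-agnostic; how the paper actually estimates context and style is not described on this page.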

