CL · Apr 16

Fabricator or dynamic translator?

arXiv:2604.15165 · 61.7

Predicted impact top 16% in CL · last 90 days · Originality Synthesis-oriented

AI Analysis

For practitioners deploying LLMs for translation, this work provides a taxonomy and detection strategies for overgenerations, though it is incremental and lacks concrete performance numbers.

The paper investigates overgenerations in LLM-based machine translation, categorizing them into self-explanations, risky confabulations, and appropriate explanations. It presents strategies and results from a commercial setting for detecting and classifying these overgenerations.

LLMs are proving to be adept at machine translation, although due to their generative nature they may at times overgenerate in various ways. These overgenerations differ from the neurobabble seen in NMT and range from LLM self-explanations, to risky confabulations, to appropriate explanations, in which the LLM acts as a human translator would, enabling greater comprehension for the target audience. Detecting overgenerations and determining their exact nature is a challenging task. We detail the different strategies we have explored in a commercial setting and present our results.
