CL
Nov 17, 2025

Translation Entropy: A Statistical Framework for Evaluating Translation Systems

arXiv:2511.13180v1
1 citation
h-index: 5
Physica A: Statistical Mechanics and its Applications
Originality: Highly original
AI Analysis

The paper provides an objective benchmarking tool for AI translation systems, addressing a key evaluation gap in natural language processing.

The study addresses the lack of quantitative methods for evaluating translation systems by introducing translation entropy, a statistical measure based on token replacement probabilities. It applies the measure to rank MarianMT, T5-Base, and NLLB-200, tests whether mutual translation entropy is symmetric, and reports a multiplicative effect in translation degeneracy when two tokens are replaced.

The translation of written language has been known since the 3rd century BC; however, its necessity has become increasingly common in the information age. Today, many translators exist, based on deep encoder-decoder architectures. Nevertheless, no quantitative, objective methods are available to assess their performance, likely because the entropy of even a single language remains unknown. This study presents a quantitative method for estimating translation entropy, built on the following key finding. For a given translator, several sentences that differ from a given pivot sentence in only one selected token yield identical translations. Analyzing the statistics of this phenomenon across an ensemble of such pivot sentences, each containing the selected token, yields the probabilities of replacing this specific token with others while preserving the translation. These probabilities constitute the entropy of the selected token, and the average across all selected pivot tokens provides an estimate of the translator's overall translation entropy, which is enhanced along the decoder blocks. This entropic measure allows for the quantitative ranking of several publicly available translators and reveals whether mutual translation entropy is symmetric. Extending the proposed method to the replacement of two tokens in a given pivot sentence demonstrates a multiplicative effect, where translation degeneracy is proportional to the product of the degeneracies of the two tokens. These findings establish translation entropy as a measurable property and as an objective benchmark for artificial translators. Results are based on the MarianMT, T5-Base, and NLLB-200 translators.
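The averaging step described in the abstract can be illustrated with a minimal sketch. It assumes the per-token entropy is the Shannon entropy of the empirical replacement distribution, which the abstract does not state explicitly. The function names `token_entropy` and `translation_entropy` and the toy counts are hypothetical and are not taken from the paper. No translator is called here; the counts stand in for how often each replacement token preserved the translation across an ensemble of pivot sentences.

```python
import math

def token_entropy(replacement_counts):
    """Shannon entropy (in nats) of the empirical replacement distribution.

    replacement_counts maps a replacement token to the number of pivot
    sentences in which substituting it left the translation unchanged.
    Assumption: the paper's per-token entropy has this Shannon form.
    """
    total = sum(replacement_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in replacement_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy

def translation_entropy(per_token_counts):
    """Average the per-token entropies over all selected pivot tokens."""
    entropies = [token_entropy(counts) for counts in per_token_counts]
    return sum(entropies) / len(entropies)

# Hypothetical toy data, not results from the paper.
counts_a = {"big": 2, "large": 2}  # two equally likely replacements
counts_b = {"car": 4}              # only one token preserves the translation

s_a = token_entropy(counts_a)
s_b = token_entropy(counts_b)
average = translation_entropy([counts_a, counts_b])

# If degeneracy is read as exp(entropy), a product of degeneracies
# corresponds to a sum of entropies.
joint_degeneracy = math.exp(s_a) * math.exp(s_b)
```

Under this reading, the multiplicative effect reported for two-token replacement is equivalent to the two token entropies adding, which is what independent replacements would produce.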

