CL | Jul 24, 2025

Uncertainty Quantification for Evaluating Machine Translation Bias

Ieva Raminta Staliūnaitė, Julius Cheng, Andreas Vlachos

arXiv:2507.18338v2 | 4.92 citations | h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses gender bias evaluation in machine translation for ambiguous inputs, which is an incremental improvement over prior methods limited to unambiguous cases.

The paper tackles the problem of measuring gender bias in machine translation systems, particularly for ambiguous source sentences. It finds that high translation accuracy does not correlate with appropriate uncertainty handling, and that debiasing methods affect ambiguous and unambiguous cases differently.

The predictive uncertainty of machine translation (MT) models is typically used as a quality estimation proxy. In this work, we posit that apart from confidently translating when a single correct translation exists, models should also maintain uncertainty when the input is ambiguous. We use uncertainty to measure gender bias in MT systems. When the source sentence includes a lexeme whose gender is not overtly marked, but whose target-language equivalent requires gender specification, the model must infer the appropriate gender from the context and can be susceptible to biases. Prior work measured bias via gender accuracy, which cannot be applied to ambiguous cases. Using semantic uncertainty, we are able to assess bias when translating both ambiguous and unambiguous source sentences, and find that high translation accuracy does not correlate with exhibiting uncertainty appropriately, and that debiasing affects the two cases differently.
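To make the idea of "maintaining uncertainty when the input is ambiguous" concrete, here is a minimal sketch. It is not the authors' method. It assumes that several translations are sampled per source sentence and that each sample has already been labeled with the grammatical gender it assigns to the target word. The function name `gender_entropy` and the label lists are hypothetical. Under those assumptions, the Shannon entropy of the label distribution is one simple proxy for uncertainty over gender.

```python
import math
from collections import Counter


def gender_entropy(gender_labels):
    """Shannon entropy (in bits) of gender labels over sampled translations.

    Assumption: each label is the gender realized in one sampled translation.
    """
    counts = Counter(gender_labels)
    total = sum(counts.values())
    # p * log2(1/p) summed over observed labels; an empty input yields 0.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical labels for 8 sampled translations of one source sentence each.
ambiguous_samples = ["f", "m", "f", "m", "m", "f", "m", "f"]
unambiguous_samples = ["f"] * 8

print(gender_entropy(ambiguous_samples))    # evenly split labels
print(gender_entropy(unambiguous_samples))  # a single label throughout
```

In this illustration, an evenly split label set gives the maximum entropy for two classes, while a set with a single label gives zero. A model behaving as the abstract describes would tend toward the former on ambiguous inputs and the latter on unambiguous ones.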
