Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE
This work addresses the problem of building inclusive multilingual technologies for users of languages with gendered morphology, but its contribution is incremental: it extends existing gender-neutral translation research with a new resource and evaluation.
The paper tackles the challenge of gender-neutral translation (GNT) across multiple languages by introducing mGeNTE, an expert-curated resource. It finds that state-of-the-art language models recognize when neutrality is appropriate but fail to produce neutral translations consistently, which limits their usability.
Avoiding the propagation of undue (binary) gender inferences and default masculine language remains a key challenge towards inclusive multilingual technologies, particularly when translating into languages with extensive gendered morphology. Gender-neutral translation (GNT) represents a linguistic strategy towards fairer communication across languages. However, research on GNT is limited to a few resources and language pairs. To address this gap, we introduce mGeNTE, an expert-curated resource, and use it to conduct the first systematic multilingual evaluation of inclusive translation with state-of-the-art instruction-following language models (LMs). Experiments on en-es/de/it/el reveal that while models can recognize when neutrality is appropriate, they cannot consistently produce neutral translations, limiting their usability. To probe this behavior, we enrich our evaluation with interpretability analyses that identify task-relevant features and offer initial insights into the internal dynamics of LM-based GNT.