CL, Jun 7, 2023

Gender, names and other mysteries: Towards the ambiguous for gender-inclusive translation

arXiv:2306.04573v1213 citations, h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses gender bias in machine translation into languages with richer grammatical gender than the source language. The advance is incremental, as it builds on existing bias research.

The paper tackles gender-inclusive machine translation for inputs that lack explicit gender markers, such as sentences containing person names. It finds that gender-ambiguous examples make up a large proportion of training data, which poses a challenge for current approaches.

The vast majority of work on gender in MT focuses on 'unambiguous' inputs, where gender markers in the source language are expected to be resolved in the output. Conversely, this paper explores the widespread case where the source sentence lacks explicit gender markers, but the target sentence contains them due to richer grammatical gender. We particularly focus on inputs containing person names. Investigating such sentence pairs casts a new light on research into MT gender bias and its mitigation. We find that many name-gender co-occurrences in MT data are not resolvable with 'unambiguous gender' in the source language, and that gender-ambiguous examples can make up a large proportion of training examples. From this, we discuss potential steps toward gender-inclusive translation which accepts the ambiguity in both gender and translation.

