CL · Oct 30, 2019

How does Grammatical Gender Affect Noun Representations in Gender-Marking Languages?

arXiv:1910.14161v11101 citations
Originality: Incremental advance
AI Analysis

This paper addresses bias in NLP for languages with grammatical gender. The advance is incremental because it builds on existing debiasing methods.

The study examines how grammatical gender biases word representations in gender-marking languages, showing that inanimate nouns with the same gender lie closer together in embedding space. It also demonstrates that neutralizing grammatical gender signals during training improves embedding quality in both monolingual and cross-lingual settings.
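The "closer in embedding space" claim can be made concrete by comparing the average cosine similarity of same-gender noun pairs with that of different-gender pairs. The sketch below is only an illustration: the function name `gender_similarity_gap`, the Spanish nouns, and the two-dimensional vectors are invented for this example and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def gender_similarity_gap(vectors, genders):
    """Mean same-gender similarity minus mean different-gender similarity.

    A positive value means same-gender nouns sit closer together on average.
    """
    same, diff = [], []
    for w1, w2 in combinations(sorted(vectors), 2):
        sim = cosine(vectors[w1], vectors[w2])
        if genders[w1] == genders[w2]:
            same.append(sim)
        else:
            diff.append(sim)
    return sum(same) / len(same) - sum(diff) / len(diff)


# Toy data (assumption): invented 2-d vectors for four Spanish inanimate nouns.
toy_vectors = {
    "mesa": [1.0, 0.2],   # feminine
    "silla": [0.9, 0.3],  # feminine
    "libro": [0.2, 1.0],  # masculine
    "vaso": [0.3, 0.9],   # masculine
}
toy_genders = {"mesa": "f", "silla": "f", "libro": "m", "vaso": "m"}

gap = gender_similarity_gap(toy_vectors, toy_genders)
print(f"same-gender minus different-gender similarity: {gap:.3f}")
```

With real embeddings, the same measurement would be run over a large list of inanimate nouns whose gender comes from a lexicon or morphological analyzer.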

Many natural languages assign grammatical gender to inanimate nouns as well. In such languages, words that relate to a gender-marked noun are inflected to agree with the noun's gender. We show that this affects the word representations of inanimate nouns, resulting in nouns with the same gender being closer to each other than nouns with different genders. While "embedding debiasing" methods fail to remove the effect, we demonstrate that a careful application of methods that neutralize grammatical gender signals from the words' context when training word embeddings is effective in removing it. Fixing the grammatical gender bias improves the quality of the resulting word embeddings, in both monolingual and cross-lingual settings. We note that successfully removing gender signals, while achievable, is not trivial, and that a language-specific morphological analyzer, used carefully, is essential for achieving good results.
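One way to picture "neutralizing grammatical gender signals from the words' context" is to rewrite gender-marked context words into a single form before the (target, context) training pairs are built. The sketch below is a rough illustration under stated assumptions, not the authors' procedure: the hand-written `NEUTRAL_FORM` table is a hypothetical stand-in for a real language-specific morphological analyzer, and the Spanish tokens and the function names `neutralize` and `training_pairs` are invented for this example.

```python
# Hypothetical lookup table (assumption): maps feminine-marked forms to the
# masculine form, so every context word appears in one gender only.
# A real system would query a morphological analyzer instead.
NEUTRAL_FORM = {
    "la": "el",
    "una": "un",
    "roja": "rojo",
    "pequeña": "pequeño",
}


def neutralize(token):
    """Return the single-gender form of a token, or the token itself."""
    return NEUTRAL_FORM.get(token, token)


def training_pairs(sentence, window=2):
    """Build skip-gram style (target, context) pairs.

    Only the context side is neutralized; the target word is left unchanged,
    so each noun still gets its own vector.
    """
    pairs = []
    for i, target in enumerate(sentence):
        start = max(0, i - window)
        end = min(len(sentence), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((target, neutralize(sentence[j])))
    return pairs


pairs = training_pairs(["la", "mesa", "roja"])
for target, context in pairs:
    print(target, context)
```

In this toy run the noun "mesa" is paired with "el" and "rojo" rather than "la" and "roja", so its contexts no longer reveal its gender. A lookup table cannot resolve ambiguous or unseen forms, which is where a morphological analyzer becomes necessary.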

Code Implementations: 1 repo
