CL · AI · Apr 30, 2024

Transforming Dutch: Debiasing Dutch Coreference Resolution Systems for Non-binary Pronouns

arXiv:2405.00134v1 · 3 citations · h-index: 2 · FAccT
Originality: Incremental advance
AI Analysis

This paper addresses the risk of misgendering non-binary individuals in Dutch NLP systems and offers a practical debiasing solution. The advance is incremental, since it builds on existing techniques such as CDA.

The paper tackles the poor performance of Dutch coreference resolution systems on the gender-neutral pronouns 'hen' and 'die'. It shows that Counterfactual Data Augmentation (CDA) substantially reduces the performance gap between gendered and gender-neutral pronouns, and that CDA remains effective in low-resource settings and on unseen neopronouns.
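As a rough illustration of the CDA idea mentioned above, the sketch below adds a pronoun-swapped copy of each training document to the original data. It is not the authors' implementation: the names `SWAP_TO_HEN`, `swap_pronouns`, and `cda` are hypothetical, and the swap table is an assumption that covers only two unambiguous masculine forms. Ambiguous Dutch forms such as "zij", "zijn", and "haar" would likely need part-of-speech or syntactic information and are left out here.

```python
import re

# Hypothetical, deliberately tiny swap table (an assumption, not the paper's):
# only unambiguous masculine forms are rewritten to the gender-neutral "hen".
SWAP_TO_HEN = {"hij": "hen", "hem": "hen"}


def swap_pronouns(text, swap):
    """Replace whole-word pronouns found in `swap`, keeping initial capitals."""
    def repl(match):
        word = match.group(0)
        new = swap.get(word.lower())
        if new is None:
            return word  # not a pronoun we rewrite
        return new.capitalize() if word[0].isupper() else new

    return re.sub(r"\b\w+\b", repl, text)


def cda(documents, swap):
    """Return the original documents plus one counterfactual copy of each."""
    return list(documents) + [swap_pronouns(doc, swap) for doc in documents]


if __name__ == "__main__":
    for doc in cda(["Hij zag hem."], SWAP_TO_HEN):
        print(doc)
```

A real pipeline would also have to carry the coreference annotations over to the rewritten copies. This sketch only transforms the raw text.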

Gender-neutral pronouns are increasingly being introduced across Western languages. Recent evaluations have however demonstrated that English NLP systems are unable to correctly process gender-neutral pronouns, with the risk of erasing and misgendering non-binary individuals. This paper examines a Dutch coreference resolution system's performance on gender-neutral pronouns, specifically hen and die. In Dutch, these pronouns were only introduced in 2016, compared to the longstanding existence of singular they in English. We additionally compare two debiasing techniques for coreference resolution systems in non-binary contexts: Counterfactual Data Augmentation (CDA) and delexicalisation. Moreover, because pronoun performance can be hard to interpret from a general evaluation metric like LEA, we introduce an innovative evaluation metric, the pronoun score, which directly represents the portion of correctly processed pronouns. Our results reveal diminished performance on gender-neutral pronouns compared to gendered counterparts. Nevertheless, although delexicalisation fails to yield improvements, CDA substantially reduces the performance gap between gendered and gender-neutral pronouns. We further show that CDA remains effective in low-resource settings, in which a limited set of debiasing documents is used. This efficacy extends to previously unseen neopronouns, which are currently infrequently used but may gain popularity in the future, underscoring the viability of effective debiasing with minimal resources and low computational costs.
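The abstract describes the pronoun score only as the portion of correctly processed pronouns. A minimal sketch of such a metric follows, under the assumption that a pronoun counts as correct when its predicted antecedent belongs to its gold set of antecedents. That criterion, the function name `pronoun_score`, and the input format are illustrative choices, not the paper's definition.

```python
def pronoun_score(predicted, gold):
    """Fraction of pronouns whose predicted antecedent is a gold antecedent.

    predicted: dict mapping pronoun id -> predicted antecedent id (may be missing)
    gold:      dict mapping pronoun id -> set of acceptable antecedent ids

    The correctness criterion is an assumption made for this sketch.
    """
    if not gold:
        return 0.0
    correct = sum(
        1 for pronoun, antecedents in gold.items()
        if predicted.get(pronoun) in antecedents
    )
    return correct / len(gold)


if __name__ == "__main__":
    gold = {"p1": {"m1"}, "p2": {"m2", "m3"}, "p3": {"m4"}}
    predicted = {"p1": "m1", "p2": "m5"}  # p3 has no prediction
    print(pronoun_score(predicted, gold))
```

Unlike a cluster-level metric such as LEA, a score of this shape can be read directly as "how many pronouns were resolved correctly", which is the interpretability argument the abstract makes.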

Code Implementations: 1 repo
