CL, Dec 26, 2024

GFG -- Gender-Fair Generation: A CALAMITA Challenge

arXiv:2412.19168v2, 4 citations, h-index: 34, CLICIT
Originality: Synthesis-oriented
AI Analysis

This paper addresses gender inequality in written communication for Italian speakers. Its contribution is incremental, since it builds on existing datasets and language-processing methods.

The paper tackles the difficulty of promoting gender-fair language in heavily gender-marked languages like Italian by introducing the Gender-Fair Generation challenge. The challenge comprises three tasks: detecting gendered expressions, reformulating them into gender-fair alternatives, and generating gender-fair translations. Systems are evaluated with metrics such as F1 score and accuracy.

Gender-fair language aims to promote gender equality by using terms and expressions that include all identities and avoid reinforcing gender stereotypes. Implementing gender-fair strategies is particularly challenging in heavily gender-marked languages, such as Italian. To address this, the Gender-Fair Generation challenge is intended to support the shift toward gender-fair language in written communication. The challenge, designed to assess and monitor the recognition and generation of gender-fair language in both mono- and cross-lingual scenarios, includes three tasks: (1) the detection of gendered expressions in Italian sentences, (2) the reformulation of gendered expressions into gender-fair alternatives, and (3) the generation of gender-fair language in automatic translation from English to Italian. The challenge relies on three annotated datasets: the GFL-it corpus, which contains Italian texts extracted from administrative documents provided by the University of Brescia; GeNTE, a bilingual test set for gender-neutral rewriting and translation built upon a subset of the Europarl dataset; and Neo-GATE, a bilingual test set designed to assess the use of non-binary neomorphemes in Italian for both fair formulation and translation tasks. Finally, each task is evaluated with specific metrics: for task 1, the average F1 score obtained by means of BERTScore computed on each entry of the datasets; for tasks 2 and 3, an accuracy measured with a gender-neutral classifier and a coverage-weighted accuracy.
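As a rough illustration of how the metrics named above might be aggregated, here is a minimal Python sketch. The abstract does not define the formulas, so everything here is an assumption: the function names `average_f1` and `coverage_weighted_accuracy` are hypothetical, the per-entry F1 values are assumed to come from a BERTScore run performed elsewhere, and coverage-weighted accuracy is assumed to be coverage (share of annotated terms found in the output) multiplied by accuracy (share of found terms that are correct). The challenge's own definitions may differ.

```python
from statistics import mean


def average_f1(per_entry_f1):
    """Average per-entry F1 scores over a dataset.

    The scores are assumed to be precomputed (e.g. BERTScore F1 values,
    one per dataset entry); this function only aggregates them.
    """
    if not per_entry_f1:
        raise ValueError("need at least one per-entry score")
    return mean(per_entry_f1)


def coverage_weighted_accuracy(total_terms, found_terms, correct_terms):
    """ASSUMED definition: coverage * accuracy.

    coverage = found_terms / total_terms   (annotated terms present in output)
    accuracy = correct_terms / found_terms (found terms in the desired form)
    """
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    if not 0 <= correct_terms <= found_terms <= total_terms:
        raise ValueError("expected 0 <= correct <= found <= total")
    if found_terms == 0:
        # Nothing was found, so there is nothing to be accurate about.
        return 0.0
    coverage = found_terms / total_terms
    accuracy = correct_terms / found_terms
    return coverage * accuracy


if __name__ == "__main__":
    print(average_f1([0.5, 1.0]))
    print(coverage_weighted_accuracy(total_terms=8, found_terms=4, correct_terms=2))
```

Under this assumed definition the product reduces to correct terms over total annotated terms, so a system that skips annotated terms is penalised even when the terms it does produce are correct.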

