CL · Mar 18, 2022

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation

arXiv:2203.09866v1649 citations · h-index: 47
Originality: Synthesis-oriented
AI Analysis

This work addresses gender bias in speech translation for grammatical gender languages, offering a more nuanced evaluation method, though it is incremental in building on existing corpus and annotation techniques.

The study tackled gender bias in speech translation by evaluating how it manifests across different lexical categories and agreement phenomena in grammatical gender languages, finding that dedicated analyses beyond aggregated results are valuable for understanding model behaviors and bias detection.

Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages. However, most current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions. Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement, marked on a variety of lexical items and parts of speech (POS). To overcome this limitation, we enrich the natural, gender-sensitive MuST-SHE corpus (Bentivogli et al., 2020) with two new linguistic annotation layers (POS and agreement chains), and explore to what extent different lexical categories and agreement phenomena are impacted by gender skews. Focusing on speech translation, we conduct a multifaceted evaluation on three language directions (English-French/Italian/Spanish), with models trained on varying amounts of data and different word segmentation techniques. By shedding light on model behaviours, gender bias, and its detection at several levels of granularity, our findings emphasize the value of dedicated analyses beyond aggregated overall results.
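To make the idea of evaluating "beyond aggregated overall results" concrete, here is a minimal sketch of a per-POS breakdown of gender realization. It is not the authors' evaluation script: the data layout, the function name `gender_scores_by_pos`, the whitespace token matching, and the coverage/accuracy definitions are all assumptions chosen for illustration, and the Italian sentences are toy examples rather than MuST-SHE data.

```python
from collections import defaultdict


def gender_scores_by_pos(samples):
    """Compute (coverage, accuracy) per POS tag.

    Hypothetical input format: an iterable of (hypothesis, terms) pairs,
    where terms is a list of (correct_form, wrong_form, pos) tuples
    describing the gender-marked words annotated for that sentence.
    """
    counts = defaultdict(lambda: {"total": 0, "correct": 0, "wrong": 0})
    for hypothesis, terms in samples:
        # Naive whitespace tokenization; a real setup would need more care.
        tokens = set(hypothesis.lower().split())
        for correct_form, wrong_form, pos in terms:
            c = counts[pos]
            c["total"] += 1
            if correct_form.lower() in tokens:
                c["correct"] += 1
            elif wrong_form.lower() in tokens:
                c["wrong"] += 1

    scores = {}
    for pos, c in counts.items():
        found = c["correct"] + c["wrong"]
        # Coverage: share of annotated terms produced in either gender form.
        coverage = found / c["total"]
        # Accuracy: share of produced terms carrying the correct gender.
        accuracy = c["correct"] / found if found else 0.0
        scores[pos] = (coverage, accuracy)
    return scores


# Toy system outputs with hypothetical annotations (feminine referent expected).
samples = [
    ("sono stata una studentessa",
     [("stata", "stato", "VERB"), ("studentessa", "studente", "NOUN")]),
    ("sono stato un bravo medico",
     [("stata", "stato", "VERB"), ("brava", "bravo", "ADJ")]),
    ("ero molto felice",
     [("contenta", "contento", "ADJ")]),
]

scores = gender_scores_by_pos(samples)
for pos, (coverage, accuracy) in sorted(scores.items()):
    print(f"{pos}: coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

On this toy input the nouns look unbiased while the adjectives and verbs do not, which is the kind of difference a single pooled score would hide.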

Code Implementations: 1 repo