CL, Apr 29, 2020

Automatically Identifying Gender Issues in Machine Translation using Perturbations

arXiv:2004.14065v231.31009 citations

Originality: Incremental advance

AI Analysis

The paper addresses gender bias in deployed machine translation systems, a critical fairness issue for users. The advance is incremental because it builds on prior studies that relied on synthetic examples.

The authors tackled the problem of identifying gender issues in machine translation by developing a novel technique to mine examples from real-world data, resulting in a publicly released evaluation benchmark for four languages that exposes gendered model representations and their unintended consequences.

The successful application of neural methods to machine translation has realized huge quality advances for the community. With these improvements, many have noted outstanding challenges, including the modeling and treatment of gendered language. While previous studies have identified issues using synthetic examples, we develop a novel technique to mine examples from real-world data to explore challenges for deployed systems. We use our method to compile an evaluation benchmark spanning examples for four languages from three language families, which we publicly release to facilitate research. The examples in our benchmark expose where model representations are gendered, and the unintended consequences these gendered representations can have in downstream applications.
