CL | Sep 7, 2021

Hi, my name is Martha: Using names to measure and mitigate bias in generative dialogue models

arXiv:2109.03300v1 | 31 citations
Originality: Incremental advance
AI Analysis

This paper addresses bias in AI dialogue systems, which can perpetuate stereotypes and affect user interactions. The contribution is incremental because it builds on existing bias mitigation techniques.

The paper tackled bias in generative dialogue models by measuring how the distribution of generated conversations shifts when a speaker states a name associated with a particular gender or race/ethnicity. It found that larger models exhibit more gender bias and more occupational stereotyping by gender. It also demonstrated that name scrambling, controlled generation, and unlikelihood training each reduce bias, including on a downstream conversational task.
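
The kind of measurement described above can be illustrated with a minimal sketch. It compares unigram token distributions between two sets of conversations using total variation distance. This is a simple stand-in chosen for illustration, not the metric used in the paper; the function names and the toy conversations are hypothetical and are not data or results from the paper.

```python
from collections import Counter


def token_distribution(conversations):
    """Return relative unigram frequencies over a list of conversation strings."""
    counts = Counter(
        tok for conv in conversations for tok in conv.lower().split()
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two token distributions (0 = identical, 1 = disjoint)."""
    vocab = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in vocab)


# Toy, made-up conversations standing in for model self-chats in which one
# partner states a name associated with group A or group B.
group_a = ["hi my name is martha i love gardening", "i work as a nurse"]
group_b = ["hi my name is john i love football", "i work as an engineer"]

gap = total_variation(token_distribution(group_a), token_distribution(group_b))
```

A larger `gap` means the two sets of conversations use tokens more differently. In practice one would mask or remove the names themselves before comparing, since they differ by construction.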

All AI models are susceptible to learning biases in data that they are trained on. For generative dialogue models, being trained on real human conversations containing unbalanced gender and race/ethnicity references can lead to models that display learned biases, which we define here broadly as any measurable differences in the distributions of words or semantic content of conversations based on demographic groups. We measure the strength of such biases by producing artificial conversations between two copies of a dialogue model, conditioning one conversational partner to state a name commonly associated with a certain gender and/or race/ethnicity. We find that larger capacity models tend to exhibit more gender bias and greater stereotyping of occupations by gender. We show that several methods of tuning these dialogue models, specifically name scrambling, controlled generation, and unlikelihood training, are effective in reducing bias in conversation, including on a downstream conversational task. Name scrambling is also effective in lowering differences in token usage across conversations where partners have names associated with different genders or races/ethnicities.
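
As a rough illustration of name scrambling, the sketch below replaces the name a speaker states in a dialogue with one drawn at random from a pool, so that a model tuned on the result sees names decorrelated from conversation content. This is one plausible reading of the technique named in the abstract, not the authors' implementation; `scramble_name`, the name pool, and the example dialogue are all assumptions made for illustration.

```python
import random
import re


def scramble_name(dialogue, original_name, name_pool, rng):
    """Replace whole-word occurrences of original_name in every turn
    with a single name sampled from name_pool.

    Returns the scrambled turns and the name that was chosen.
    """
    new_name = rng.choice(name_pool)
    pattern = re.compile(rf"\b{re.escape(original_name)}\b")
    # A function replacement avoids treating backslashes in new_name specially.
    scrambled = [pattern.sub(lambda _m: new_name, turn) for turn in dialogue]
    return scrambled, new_name


# Hypothetical example dialogue and name pool.
dialogue = [
    "Hi, my name is Martha.",
    "Nice to meet you, Martha! What do you do?",
]
pool = ["Maria", "David", "Aisha", "Wei"]

rng = random.Random(0)  # seeded so the sketch is reproducible
scrambled_dialogue, chosen = scramble_name(dialogue, "Martha", pool, rng)
```

Sampling one name per dialogue, rather than per turn, keeps each conversation internally consistent while still varying names across the training set.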

