CL, AI, CY, LG | May 22, 2023

Should We Attend More or Less? Modulating Attention for Fairness

arXiv:2305.13088v2 | 16 citations
Originality: Incremental advance
AI Analysis

The paper addresses fairness problems in NLP that affect users of biased models, but the advance is incremental because it builds on existing intra-processing techniques.

The paper tackles the problem of attention mechanisms in NLP models propagating social biases such as gender stereotypes. It proposes a post-training method that modulates attention weights, and it reports increased fairness with minimal performance loss on text classification and generation tasks.

The advances in natural language processing (NLP) pose both opportunities and challenges. While recent progress enables the development of high-performing models for a variety of tasks, it also poses the risk of models learning harmful biases from the data, such as gender stereotypes. In this work, we investigate the role of attention, a widely-used technique in current state-of-the-art NLP models, in the propagation of social biases. Specifically, we study the relationship between the entropy of the attention distribution and the model's performance and fairness. We then propose a novel method for modulating attention weights to improve model fairness after training. Since our method is only applied post-training and pre-inference, it is an intra-processing method and is, therefore, less computationally expensive than existing in-processing and pre-processing approaches. Our results show an increase in fairness and minimal performance loss on different text classification and generation tasks using language models of varying sizes. WARNING: This work uses language that is offensive.
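The relationship between attention entropy and attention modulation can be illustrated with a small sketch. The abstract does not spell out the modulation rule, so this example assumes one simple option: multiplying the pre-softmax attention scores by a scalar `beta`. The function names, the `beta` parameter, and the toy scores are all hypothetical and are not taken from the paper's code.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def attention_entropy(weights, axis=-1, eps=1e-12):
    # Shannon entropy (in nats) of each attention distribution.
    return -(weights * np.log(weights + eps)).sum(axis=axis)

def modulated_attention(scores, beta):
    # ASSUMPTION: modulation is modeled as scaling the pre-softmax scores.
    # beta < 1 flattens the distribution (higher entropy, attends more broadly);
    # beta > 1 sharpens it (lower entropy, attends more narrowly).
    return softmax(beta * scores)

# Toy pre-softmax scores: two queries attending over three tokens.
scores = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])

baseline = modulated_attention(scores, beta=1.0)
flatter = modulated_attention(scores, beta=0.5)
sharper = modulated_attention(scores, beta=2.0)

for name, weights in [("beta=0.5", flatter), ("beta=1.0", baseline), ("beta=2.0", sharper)]:
    print(name, attention_entropy(weights))
```

Because the scaling is applied only at inference time, no retraining is needed in this sketch. In practice, `beta` would be chosen on a validation set by trading off a fairness metric against task performance.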

Code Implementations: 2 repos