LG, AI, CY · Jun 30, 2023

On the Cause of Unfairness: A Training Sample Perspective

arXiv:2306.17828v2 · h-index: 15
Originality: Incremental advance
AI Analysis

This paper addresses model unfairness from the practitioner's side by providing tools to diagnose and repair training data. It is an incremental advance because it builds on existing fairness analysis methods.

The paper tackles the problem of identifying causes of model unfairness by analyzing training data, quantifying how changes in samples affect unfairness through counterfactual modifications of features, labels, and sensitive attributes. The result is a framework that helps understand and mitigate unfairness, with applications such as detecting mislabeling and poisoning attacks.

Identifying the causes of a model's unfairness is an important yet relatively unexplored task. We look into this problem through the lens of training data, the major source of unfairness. We ask the following questions: How would the unfairness of a model change if its training samples (1) were collected from a different (e.g. demographic) group, (2) were labeled differently, or (3) had their features modified? In other words, we quantify the influence of training samples on unfairness by counterfactually changing samples based on predefined concepts, i.e. data attributes such as features, labels, and sensitive attributes. Our framework can not only help practitioners understand the observed unfairness and mitigate it by repairing their training data, but also leads to many other applications, e.g. detecting mislabeling, fixing imbalanced representations, and detecting fairness-targeted poisoning attacks.
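
To make question (2) concrete, here is a minimal sketch of a label counterfactual, measured by brute-force retraining. It is an illustration under stated assumptions and not the paper's estimator: the synthetic data, the gradient-descent logistic regression, the demographic parity gap as the unfairness measure, and the helper names `fit_logreg`, `dp_gap`, and `label_flip_effect` are all hypothetical choices made for this example.

```python
import numpy as np

def _scores(w, X):
    """Predicted probabilities, with the bias folded in as a last column."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def fit_logreg(X, y, lr=0.5, steps=500, l2=1e-2):
    """Plain gradient-descent logistic regression (assumed model)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w

def dp_gap(w, X, group):
    """Demographic parity gap: absolute difference in mean score by group."""
    p = _scores(w, X)
    return abs(p[group == 0].mean() - p[group == 1].mean())

def label_flip_effect(X, y, group, idx):
    """Change in the gap if training sample idx had the opposite label."""
    base = dp_gap(fit_logreg(X, y), X, group)
    y_cf = y.copy()              # counterfactual copy; the input is not mutated
    y_cf[idx] = 1 - y_cf[idx]
    return dp_gap(fit_logreg(X, y_cf), X, group) - base

# Synthetic data (assumption): features and labels both correlate with group.
rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + 0.5 * group[:, None]
y = (X[:, 0] + 0.8 * group + rng.normal(scale=0.5, size=n) > 0.5).astype(float)

# Effect of flipping each of the first 20 training labels, one at a time.
effects = np.array([label_flip_effect(X, y, group, i) for i in range(20)])
most_helpful = int(np.argmin(effects))   # flip that lowers the gap the most
```

A negative entry in `effects` means that relabeling that sample would reduce the measured gap, which is the kind of signal a mislabeling detector could rank on. The same loop could swap a sample's group membership or perturb its features to cover questions (1) and (3). Retraining once per sample is only feasible at toy scale.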

