CL, CY, LG
Feb 11, 2023

Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP

arXiv:2302.05711v1271 citations
h-index: 17
Originality: Synthesis-oriented
AI Analysis

The paper addresses inconsistent progress in NLP fairness research, a problem for both practitioners and researchers, though its contribution is incremental because it builds on existing work.

The paper tackles the lack of standardization in evaluating and selecting models for fairness in NLP by clarifying how debiasing methods relate to one another and to fairness theory, and by addressing the fairness-accuracy trade-off in model selection.

Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct. However, current progress is hampered by a plurality of definitions of bias, means of quantification, and an oftentimes vague relation between debiasing algorithms and theoretical measures of bias. This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning, with two key contributions: (1) making clear inter-relations among the current gamut of methods, and their relation to fairness theory; and (2) addressing the practical problem of model selection, which involves a trade-off between fairness and accuracy and has led to systemic issues in fairness research. Putting them together, we make several recommendations to help shape future work.
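To make the model-selection problem in contribution (2) concrete, here is a minimal sketch of choosing among candidate models that trade accuracy against fairness. It is an illustration under stated assumptions, not the procedure from the paper: the candidate scores are invented, both metrics are assumed to lie in [0, 1] with higher being better, and the helper names `pareto_front` and `closest_to_ideal` are hypothetical. The sketch first discards dominated models, then picks the remaining model nearest to an assumed ideal point of perfect accuracy and perfect fairness.

```python
from math import hypot

# Hypothetical candidates: (name, accuracy, fairness).
# Both scores are assumed to be in [0, 1], higher is better.
candidates = [
    ("a", 0.90, 0.60),
    ("b", 0.85, 0.80),
    ("c", 0.80, 0.75),  # worse than "b" on both metrics
    ("d", 0.70, 0.95),
]


def pareto_front(models):
    """Keep models that no other model beats or ties on both metrics
    while being strictly better on at least one."""
    front = []
    for name, acc, fair in models:
        dominated = any(
            a >= acc and f >= fair and (a > acc or f > fair)
            for _, a, f in models
        )
        if not dominated:
            front.append((name, acc, fair))
    return front


def closest_to_ideal(models, ideal=(1.0, 1.0)):
    """Pick the model with the smallest Euclidean distance to the
    assumed ideal point (accuracy, fairness) = ideal."""
    return min(models, key=lambda m: hypot(ideal[0] - m[1], ideal[1] - m[2]))


front = pareto_front(candidates)
best = closest_to_ideal(front)
print([name for name, _, _ in front], best[0])
```

The distance rule is only one way to collapse two metrics into a single choice. Weighting the axes differently, or fixing a minimum accuracy and maximizing fairness, can select a different model from the same front, which is one reason results are hard to compare when papers do not state their selection criterion.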

Code Implementations: 1 repo