Systematic Evaluation of Predictive Fairness
This work addresses the need for more systematic evaluation in fairness research, which matters to researchers and practitioners developing fair AI systems. Its contribution is incremental: it highlights gaps in evaluation practice rather than proposing new methods.
The study evaluates debiasing methods for predictive fairness across diverse data conditions, such as class imbalance and stereotyping. It finds that data conditions strongly influence relative model performance, so general conclusions about method efficacy are unreliable when they rest only on standard datasets.
Mitigating bias when training on biased datasets is an important open problem. Several techniques have been proposed; however, the typical evaluation regime is limited and considers only narrow data conditions. For instance, the effects of target class imbalance and stereotyping are under-studied. To address this gap, we examine the performance of various debiasing methods across multiple tasks, spanning binary classification (Twitter sentiment), multi-class classification (profession prediction), and regression (valence prediction). Through extensive experimentation, we find that data conditions have a strong influence on relative model performance, and that general conclusions about method efficacy cannot be drawn when evaluating only on standard datasets, as is current practice in fairness research.
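
To make the notion of a controlled data condition concrete, the following is a minimal sketch of how a dataset could be subsampled to a chosen level of class imbalance and stereotyping. It is an illustration under stated assumptions, not the procedure used in this work: it assumes a binary target and a binary protected attribute, it reads "stereotyping" as the skew of the protected attribute within each class, and the function names `target_cell_counts` and `resample_condition` are hypothetical.

```python
import random
from collections import defaultdict


def target_cell_counts(n, class_ratio, stereotyping):
    """Number of examples wanted in each (label, group) cell.

    Assumed definitions (illustrative, not taken from the paper):
      class_ratio  = P(y = 1)
      stereotyping = P(g = 1 | y = 1) = P(g = 0 | y = 0)
    A stereotyping value of 0.5 means label and group are independent.
    """
    n_pos = round(n * class_ratio)
    n_neg = n - n_pos
    pos_g1 = round(n_pos * stereotyping)
    neg_g1 = round(n_neg * (1 - stereotyping))
    return {
        (1, 1): pos_g1,
        (1, 0): n_pos - pos_g1,
        (0, 1): neg_g1,
        (0, 0): n_neg - neg_g1,
    }


def resample_condition(labels, groups, n, class_ratio, stereotyping, seed=0):
    """Return indices of a subsample that matches the requested condition."""
    rng = random.Random(seed)

    # Bucket example indices by their (label, group) cell.
    cells = defaultdict(list)
    for i, (y, g) in enumerate(zip(labels, groups)):
        cells[(y, g)].append(i)

    chosen = []
    for cell, k in target_cell_counts(n, class_ratio, stereotyping).items():
        if k > len(cells[cell]):
            raise ValueError(f"not enough examples in cell {cell}: need {k}")
        # Sample without replacement inside each cell.
        chosen.extend(rng.sample(cells[cell], k))

    rng.shuffle(chosen)
    return chosen


if __name__ == "__main__":
    # Toy pool with 100 examples in every (label, group) cell.
    labels = [y for y in (0, 1) for g in (0, 1) for _ in range(100)]
    groups = [g for y in (0, 1) for g in (0, 1) for _ in range(100)]
    idx = resample_condition(labels, groups, n=100,
                             class_ratio=0.8, stereotyping=0.9)
    print(len(idx), sum(labels[i] for i in idx))
```

Sweeping `class_ratio` and `stereotyping` over a grid and retraining each debiasing method on every resulting subsample is one way to check whether a ranking of methods observed on the original data distribution holds under other conditions.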