Learning Non-Discriminatory Predictors
The paper addresses fairness in machine learning, which matters in applications where bias mitigation is critical; its contribution is incremental in that it builds on the existing equalized odds definition.
The paper tackles the problem of learning predictors that are non-discriminatory with respect to protected attributes under the equalized odds criterion. It shows that a post-hoc correction method can be highly suboptimal, proposes a nearly-optimal statistical procedure, argues that the associated computational problem is intractable, and suggests a tractable second moment relaxation.
We consider learning a predictor which is non-discriminatory with respect to a "protected attribute" according to the notion of "equalized odds" proposed by Hardt et al. [2016]. We study the problem of learning such a non-discriminatory predictor from a finite training set, both statistically and computationally. We show that a post-hoc correction approach, as suggested by Hardt et al., can be highly suboptimal, present a nearly-optimal statistical procedure, argue that the associated computational problem is intractable, and suggest a second moment relaxation of the non-discrimination definition for which learning is tractable.
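
To make the equalized odds criterion concrete: a binary predictor satisfies it when its prediction is independent of the protected attribute given the true label, so its true positive rate and false positive rate match across groups. The following is a minimal sketch, not the paper's procedure, of how one might measure the violation on a finite sample. The function name `equalized_odds_gap`, the restriction to binary labels and predictions, and the toy data are assumptions made for illustration.

```python
import numpy as np


def equalized_odds_gap(y_true, y_pred, group):
    """Empirical equalized-odds violation for a binary predictor.

    Hypothetical helper (not from the paper). For each true label y in {0, 1},
    it computes the rate of positive predictions within every protected group
    and returns the largest difference between groups. A gap of 0 for both
    labels means equalized odds holds exactly on this sample.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)

    gaps = {}
    for y in (0, 1):
        rates = []
        for a in np.unique(group):
            mask = (y_true == y) & (group == a)
            if mask.any():  # skip groups with no examples of this label
                rates.append(float(y_pred[mask].mean()))
        # y == 1 compares true positive rates, y == 0 false positive rates
        gaps[y] = max(rates) - min(rates) if rates else 0.0
    return gaps


# Toy data (illustrative only): two groups of four examples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

print(equalized_odds_gap(y_true, y_pred, group))
```

On this toy sample, group 0 has a true positive rate of 0.5 and a false positive rate of 0, while group 1 has 1.0 and 0.5, so both gaps are 0.5. A sample estimate like this is noisy when a group has few examples of a given label, which is one reason learning from a finite training set needs separate statistical treatment.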