The impossibility of "fairness": a generalized impossibility result for decisions
This result bears on fairness in machine learning and applies to all predictors, whether algorithmic or human: it reveals inherent trade-offs between fairness definitions.
The paper shows that, when groups differ in the prevalence of the predicted event, three intuitive sets of fairness measures are mutually exclusive, so any non-perfect, non-trivial predictor must be unfair under two of the three sets of criteria. The result generalizes to statistical quantities such as sensitivity and specificity.
Various measures can be used to estimate bias or unfairness in a predictor. Previous work has already established that some of these measures are incompatible with each other. Here we show that, when groups differ in prevalence of the predicted event, several intuitive, reasonable measures of fairness (probability of positive prediction given occurrence or non-occurrence; probability of occurrence given prediction or non-prediction; and ratio of predictions over occurrences for each group) are all mutually exclusive: if one of them is equal among groups, the other two must differ. The only exceptions are perfect or trivial (always-positive or always-negative) predictors. As a consequence, any non-perfect, non-trivial predictor must necessarily be "unfair" under two out of three reasonable sets of criteria. This result readily generalizes to a wide range of well-known statistical quantities (sensitivity, specificity, false positive rate, precision, etc.), all of which can be divided into three mutually exclusive groups. Importantly, the result applies to all predictors, whether algorithmic or human. We conclude with possible ways to handle this effect when assessing and designing prediction methods.
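As a rough numerical illustration of the trade-off described above (not taken from the paper), the following Python sketch computes the three sets of measures for two hypothetical groups. The function name `fairness_measures`, the dictionary keys, and all numeric values (prevalences 0.5 and 0.2, sensitivity 0.8, false positive rate 0.2) are assumptions chosen for the example. Both groups are given the same probability of positive prediction given occurrence and given non-occurrence, so the first set of criteria is satisfied by construction.

```python
def fairness_measures(prevalence, sensitivity, false_positive_rate):
    """Compute the three sets of measures for one group.

    All inputs are illustrative assumptions:
    prevalence          = P(occurrence)
    sensitivity         = P(positive prediction | occurrence)
    false_positive_rate = P(positive prediction | non-occurrence)
    """
    # Joint probabilities of the four confusion-matrix cells.
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * false_positive_rate
    true_neg = (1 - prevalence) * (1 - false_positive_rate)

    predicted_pos = true_pos + false_pos
    predicted_neg = false_neg + true_neg

    return {
        # Set 1: probability of positive prediction given (non-)occurrence.
        "p_pred_given_occurrence": sensitivity,
        "p_pred_given_non_occurrence": false_positive_rate,
        # Set 2: probability of occurrence given (non-)prediction.
        "p_occurrence_given_pred": true_pos / predicted_pos,
        "p_occurrence_given_non_pred": false_neg / predicted_neg,
        # Set 3: ratio of predictions over occurrences.
        "predictions_over_occurrences": predicted_pos / prevalence,
    }


# Two hypothetical groups that differ only in prevalence.
group_a = fairness_measures(prevalence=0.5, sensitivity=0.8, false_positive_rate=0.2)
group_b = fairness_measures(prevalence=0.2, sensitivity=0.8, false_positive_rate=0.2)

for name in group_a:
    print(f"{name}: A={group_a[name]:.3f}  B={group_b[name]:.3f}")
```

With these assumed numbers, the first set of measures is equal across groups while the other two differ: the probability of occurrence given a positive prediction is 0.8 for group A against 0.5 for group B, and the ratio of predictions over occurrences is 1.0 against 1.6. This is consistent with the claim that equalizing one set of criteria forces the other two apart when prevalence differs.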