FlipTest: Fairness Testing via Optimal Transport
FlipTest provides a practical tool for uncovering hidden biases in AI systems, which is crucial for ensuring fairness in applications such as hiring or lending. It is, however, an incremental improvement on existing testing methods.
FlipTest detects discrimination in classifiers by using optimal transport to match individuals across protected groups and identifying those whose classification changes under the match. This can reveal potential harm even when group fairness criteria are met, as demonstrated in three case studies.
We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a "flipset": the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a "transparency report": a ranking of features that are most associated with the model's behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria.
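To make the flipset and transparency report concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the transport map is computed or how features are ranked, so the sketch makes several assumptions: the two groups have equal size, the optimal transport map is computed exactly as a one-to-one assignment under squared Euclidean cost (via SciPy's `linear_sum_assignment`), and features are ranked by the magnitude of their mean change across flipset members. The function names, the toy data, and the toy classifier are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def optimal_transport_pairs(X_a, X_b):
    """Match each row of X_a to a distinct row of X_b at minimum total cost.

    Assumption: equal group sizes, so the optimal transport plan between the
    two empirical distributions is a one-to-one assignment.
    """
    assert len(X_a) == len(X_b), "this sketch assumes equal-size groups"
    cost = cdist(X_a, X_b, metric="sqeuclidean")
    _, cols = linear_sum_assignment(cost)  # row indices come back as 0..n-1
    return cols


def flipset(model, X_a, X_b):
    """Return indices of group-A members whose label changes under the match."""
    match = optimal_transport_pairs(X_a, X_b)
    X_a_mapped = X_b[match]  # each A member's counterpart in group B
    flipped = np.flatnonzero(model(X_a) != model(X_a_mapped))
    return flipped, X_a_mapped


def transparency_report(X_a, X_a_mapped, flipped, feature_names):
    """Rank features by the size of their mean change over the flipset.

    Assumption: mean change is only one plausible ranking statistic.
    """
    if flipped.size == 0:
        return []
    mean_change = (X_a_mapped[flipped] - X_a[flipped]).mean(axis=0)
    order = np.argsort(-np.abs(mean_change))
    return [(feature_names[i], float(mean_change[i])) for i in order]


# Hypothetical toy data: two groups of three individuals, two features each.
X_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
X_b = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])


def toy_model(X):
    # Hypothetical black-box classifier: positive when x0 + 2*x1 >= 2.5.
    return (X[:, 0] + 2.0 * X[:, 1] >= 2.5).astype(int)


flipped, X_a_mapped = flipset(toy_model, X_a, X_b)
report = transparency_report(X_a, X_a_mapped, flipped, ["x0", "x1"])
print(flipped.tolist())  # [1, 2]
print(report)            # [('x1', 1.0), ('x0', 0.0)]
```

In this toy case the second and third individuals change label once mapped to their counterparts, and the report attributes the change entirely to `x1`. Exact assignment scales poorly with sample size, so a larger dataset would likely need an approximate transport map.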