Fair Normalizing Flows
This work addresses fairness in machine learning on sensitive data and offers a more robust defense against adversarial predictors, though the contribution is incremental in that it builds on existing fair representation learning methods.
The paper tackles the problem of adversarial predictors recovering sensitive attributes from supposedly fair representations. It introduces Fair Normalizing Flows (FNF), which models the encoder as a normalizing flow trained to minimize the statistical distance between the latent representations of different groups, yielding guarantees on the maximum unfairness of any downstream predictor.
Fair representation learning is an attractive approach that promises fairness of downstream predictors by encoding sensitive data. Unfortunately, recent work has shown that strong adversarial predictors can still exhibit unfairness by recovering sensitive attributes from these representations. In this work, we present Fair Normalizing Flows (FNF), a new approach offering more rigorous fairness guarantees for learned representations. Specifically, we consider a practical setting where we can estimate the probability density for sensitive groups. The key idea is to model the encoder as a normalizing flow trained to minimize the statistical distance between the latent representations of different groups. The main advantage of FNF is that its exact likelihood computation allows us to obtain guarantees on the maximum unfairness of any potentially adversarial downstream predictor. We experimentally demonstrate the effectiveness of FNF in enforcing various group fairness notions, as well as other attractive properties such as interpretability and transfer learning, on a variety of challenging real-world datasets.
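To illustrate why known latent densities can bound an adversary, the following is a minimal sketch, not the paper's implementation. It assumes a toy discrete latent space with three values, takes total variation as the statistical distance, and uses hypothetical names (`total_variation`, `best_adversary_balanced_accuracy`, `max_demographic_parity_gap`) and made-up group distributions `p0` and `p1`. It checks two standard facts: the best possible adversary reaches balanced accuracy 1/2 + TV/2, and no binary downstream predictor has a demographic parity gap larger than TV.

```python
import itertools

import numpy as np


def total_variation(p0, p1):
    """Total variation distance between two discrete distributions."""
    return 0.5 * float(np.abs(p0 - p1).sum())


def best_adversary_balanced_accuracy(p0, p1):
    """Balanced accuracy of the adversary that guesses the likelier group."""
    guess_group1 = p1 > p0  # likelihood-ratio rule on each latent value
    acc_group0 = p0[~guess_group1].sum()  # group 0 points guessed as group 0
    acc_group1 = p1[guess_group1].sum()   # group 1 points guessed as group 1
    return 0.5 * float(acc_group0 + acc_group1)


def max_demographic_parity_gap(p0, p1):
    """Largest |P(h=1 | group 1) - P(h=1 | group 0)| over all binary h."""
    gaps = []
    for bits in itertools.product([0, 1], repeat=len(p0)):
        h = np.array(bits, dtype=float)  # one deterministic predictor
        gaps.append(abs(float(h @ p1 - h @ p0)))
    return max(gaps)


# Hypothetical latent distributions for the two sensitive groups.
p0 = np.array([0.5, 0.3, 0.2])
p1 = np.array([0.2, 0.3, 0.5])

tv = total_variation(p0, p1)
adv_acc = best_adversary_balanced_accuracy(p0, p1)
dp_gap = max_demographic_parity_gap(p0, p1)

print(f"total variation:            {tv:.3f}")
print(f"best adversary bal. acc.:   {adv_acc:.3f}")
print(f"bound 1/2 + TV/2:           {0.5 + tv / 2:.3f}")
print(f"worst demographic parity:   {dp_gap:.3f}")
```

In this toy setting the densities are given directly. With a continuous encoder, the same quantities require evaluating the latent densities, which is where the exact likelihood of a normalizing flow becomes useful.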