ML AI HC LG May 10, 2022

Don't Throw it Away! The Utility of Unlabeled Data in Fair Decision Making

Miriam Rateike, Ayan Majumdar, Olga Mineeva, Krishna P. Gummadi, Isabel Valera

arXiv:2205.04790v3 14.517 citations h-index: 71 Has Code

Originality: Incremental advance

AI Analysis

The paper addresses fairness in decision-making algorithms for applications such as hiring and lending. The advance is incremental because it builds on existing online fairness methods.

The paper tackles the problem of fair decision-making with biased and selectively labeled data by proposing a method that leverages both labeled and unlabeled data to learn unbiased representations, resulting in more stable policies with higher fairness and utility compared to previous approaches.

Decision-making algorithms, in practice, are often trained on data that exhibits a variety of biases. Decision-makers often aim to take decisions based on some ground-truth target that is assumed or expected to be unbiased, i.e., equally distributed across socially salient groups. In many practical settings, the ground-truth cannot be directly observed, and instead, we have to rely on a biased proxy measure of the ground-truth, i.e., biased labels, in the data. In addition, data is often selectively labeled, i.e., even the biased labels are only observed for a small fraction of the data that received a positive decision. To overcome label and selection biases, recent work proposes to learn stochastic, exploring decision policies via i) online training of new policies at each time-step and ii) enforcing fairness as a constraint on performance. However, the existing approach uses only labeled data, disregarding a large amount of unlabeled data, and thereby suffers from high instability and variance in the learned decision policies at different times. In this paper, we propose a novel method based on a variational autoencoder for practical fair decision-making. Our method learns an unbiased data representation leveraging both labeled and unlabeled data and uses the representations to learn a policy in an online process. Using synthetic data, we empirically validate that our method converges to the optimal (fair) policy according to the ground-truth with low variance. In real-world experiments, we further show that our training approach not only offers a more stable learning process but also yields policies with higher fairness as well as utility than previous approaches.
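The selective-labeling setting described in the abstract can be illustrated with a small Python sketch. This is not the authors' code and does not implement their variational-autoencoder method. It only shows why a stochastic, exploring policy leaves a pool of unlabeled data behind in each online round. The names `collect_round` and `exploring_policy`, the one-dimensional feature, the 0.5 label threshold, and the 0.1/0.9 clipping bounds are all assumptions made for illustration.

```python
import random


def exploring_policy(feature):
    """Hypothetical stochastic policy: acceptance probability clipped to
    [0.1, 0.9] so that every applicant keeps some chance of either outcome."""
    return min(0.9, max(0.1, feature))


def collect_round(policy, applicants, rng):
    """One online round under selective labeling.

    Each applicant is a (feature, proxy_label) pair. The proxy label is
    revealed only when the decision is positive; rejected applicants
    contribute their features but no label.
    """
    labeled, unlabeled = [], []
    for feature, proxy_label in applicants:
        accepted = rng.random() < policy(feature)
        if accepted:
            labeled.append((feature, proxy_label))
        else:
            unlabeled.append(feature)  # label is never observed
    return labeled, unlabeled


# Synthetic applicants (an assumption): feature uniform in [0, 1],
# proxy label is 1 when the feature exceeds 0.5.
rng = random.Random(0)
applicants = []
for _ in range(1000):
    x = rng.random()
    applicants.append((x, int(x > 0.5)))

labeled, unlabeled = collect_round(exploring_policy, applicants, rng)
print(f"labeled: {len(labeled)}, unlabeled: {len(unlabeled)}")
```

A method that trains only on `labeled` discards everything in `unlabeled`. The abstract's proposal is to use both pools when learning the representation.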
