Fair Representation Learning with Controllable High Confidence Guarantees via Adversarial Inference
This work addresses fairness guarantees in representation learning, with the goal of preventing unfairness toward demographic groups in downstream tasks. It is an incremental improvement that focuses on controllable, high-confidence guarantees.
The paper tackles the problem of ensuring fairness in representation learning by introducing FRG (Fair Representation learning with high-confidence Guarantees), a framework that guarantees, with controllable high probability, that demographic disparity in downstream predictions is bounded by a user-defined error threshold. In experiments on multiple datasets and downstream models, FRG consistently bounds unfairness; it is compared against six state-of-the-art fair representation learning methods.
Representation learning is increasingly applied to generate representations that generalize well across multiple downstream tasks. Ensuring fairness guarantees in representation learning is crucial to prevent unfairness toward specific demographic groups in downstream tasks. In this work, we formally introduce the task of learning representations that achieve high-confidence fairness. We aim to guarantee that demographic disparity in every downstream prediction remains bounded by a *user-defined* error threshold $ε$, with *controllable* high probability. To this end, we propose the ***F**air **R**epresentation learning with high-confidence **G**uarantees (FRG)* framework, which provides these high-confidence fairness guarantees by leveraging an optimized adversarial model. We empirically evaluate FRG on three real-world datasets, comparing its performance to six state-of-the-art fair representation learning methods. Our results demonstrate that FRG consistently bounds unfairness across a range of downstream models and tasks.
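To make the form of the guarantee concrete, the following minimal sketch checks whether the demographic disparity of one fixed set of binary downstream predictions can be certified as at most $ε$ with probability at least $1 - \delta$. It is not FRG's procedure: it swaps in a plain Hoeffding bound in place of the optimized adversarial model described above, and it covers a single predictor rather than every downstream predictor. The function names and the failure probability `delta` are assumptions introduced for illustration.

```python
import math


def demographic_disparity(preds, groups):
    """Return (|P(yhat=1 | s=0) - P(yhat=1 | s=1)|, n0, n1) for binary inputs."""
    g0 = [p for p, s in zip(preds, groups) if s == 0]
    g1 = [p for p, s in zip(preds, groups) if s == 1]
    if not g0 or not g1:
        raise ValueError("both demographic groups must be non-empty")
    gap = abs(sum(g0) / len(g0) - sum(g1) / len(g1))
    return gap, len(g0), len(g1)


def hoeffding_upper_bound(preds, groups, delta):
    """Upper bound on the true disparity that holds with probability >= 1 - delta.

    Assumption: samples are i.i.d. Each group's positive rate gets a
    two-sided Hoeffding interval at level delta / 2 (union bound).
    """
    gap, n0, n1 = demographic_disparity(preds, groups)

    def slack(n):
        # Solves 2 * exp(-2 * n * t**2) = delta / 2 for t.
        return math.sqrt(math.log(4.0 / delta) / (2.0 * n))

    return min(1.0, gap + slack(n0) + slack(n1))


def passes_fairness_check(preds, groups, epsilon, delta):
    """True if disparity <= epsilon can be certified at confidence 1 - delta."""
    return hoeffding_upper_bound(preds, groups, delta) <= epsilon


# Hypothetical data: 1000 samples per group, identical positive rates.
groups = [0] * 1000 + [1] * 1000
preds = [1, 0] * 1000
print(passes_fairness_check(preds, groups, epsilon=0.1, delta=0.05))
```

In this sketch, tightening `epsilon` or `delta` makes the check harder to pass for a fixed sample size, which is the sense in which both the threshold and the confidence level are user-controllable.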