Fairness without Demographics through Adversarially Reweighted Learning
This addresses fairness in machine learning for scenarios where privacy or regulations prevent the use of protected features, offering a practical solution to a common real-world limitation.
The paper tackles the problem of training machine learning models to improve fairness without access to protected demographic features, proposing Adversarially Reweighted Learning (ARL) which uses non-protected features and labels to co-train an adversarial reweighting approach, resulting in notable AUC improvements for worst-case protected groups across multiple datasets.
Much of the previous machine learning (ML) fairness literature assumes that protected features such as race and sex are present in the dataset, and relies upon them to mitigate fairness concerns. However, in practice factors like privacy and regulation often preclude the collection of protected features, or their use for training or inference, severely limiting the applicability of traditional fairness research. Therefore we ask: How can we train an ML model to improve fairness when we do not even know the protected group memberships? In this work we address this problem by proposing Adversarially Reweighted Learning (ARL). In particular, we hypothesize that non-protected features and task labels are valuable for identifying fairness issues, and can be used to co-train an adversarial reweighting approach for improving fairness. Our results show that {ARL} improves Rawlsian Max-Min fairness, with notable AUC improvements for worst-case protected groups in multiple datasets, outperforming state-of-the-art alternatives.