Distributionally Robust Losses for Latent Covariate Mixtures
This addresses fairness and robustness issues in machine learning for applications with diverse groups, though it is incremental as it builds on existing distributionally robust optimization methods.
The paper tackles the problem of heterogeneous subpopulations in datasets by proposing a convex procedure to control worst-case performance across all subpopulations of a given size, with finite-sample convergence guarantees and empirical improvements on tasks like lexical similarity, wine quality, and recidivism prediction.
While modern large-scale datasets often consist of heterogeneous subpopulations -- for example, multiple demographic groups or multiple text corpora -- the standard practice of minimizing average loss fails to guarantee uniformly low losses across all subpopulations. We propose a convex procedure that controls the worst-case performance over all subpopulations of a given size. Our procedure comes with finite-sample (nonparametric) convergence guarantees on the worst-off subpopulation. Empirically, we observe on lexical similarity, wine quality, and recidivism prediction tasks that our worst-case procedure learns models that do well against unseen subpopulations.