A Distributionally Robust Area Under Curve Maximization Model
This work addresses the robustness of classification models, a concern for machine learning practitioners, but its contribution is incremental, since it builds on existing work on AUC maximization and distributional robustness.
The authors address classification robustness by proposing distributionally robust AUC maximization models that improve worst-case out-of-sample performance. The models outperform standard methods on most datasets, especially with small training sets.
The area under the ROC curve (AUC) is a widely used performance measure for classification models. We propose two new distributionally robust AUC maximization models (DR-AUC) that rely on the Kantorovich metric and approximate the AUC with the hinge loss function. We consider two cases, in which the worst-case distribution has fixed and variable support, respectively. We use duality theory to reformulate the DR-AUC models and derive tractable convex optimization problems. In numerical experiments, the proposed DR-AUC models, benchmarked against the standard deterministic AUC and support vector machine models, perform better in general and, in particular, improve the worst-case out-of-sample performance on the majority of the considered datasets, which demonstrates their robustness. The results are particularly encouraging because the experiments use small training sets, which are known to be conducive to poor out-of-sample performance.
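To make the hinge-loss approximation of the AUC concrete, the sketch below computes the empirical AUC and a pairwise hinge surrogate for a linear scorer. It illustrates only the non-robust, deterministic surrogate, not the DR-AUC reformulation. The linear scoring function, the unit margin, and the function names `empirical_auc` and `pairwise_hinge_loss` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def empirical_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    # Broadcasting builds the matrix of all pairwise score differences.
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def pairwise_hinge_loss(w, X_pos, X_neg, margin=1.0):
    """Convex surrogate for 1 - AUC with a linear scorer x -> w @ x.

    Assumption: a unit margin by default. For margin >= 1, each hinge term
    max(0, margin - diff) upper-bounds the 0/1 misranking indicator 1[diff <= 0].
    """
    diff = (X_pos @ w)[:, None] - (X_neg @ w)[None, :]
    return float(np.mean(np.maximum(0.0, margin - diff)))


if __name__ == "__main__":
    # Hypothetical toy data: the first feature separates the two classes.
    X_pos = np.array([[2.0, 0.0], [3.0, 0.0]])
    X_neg = np.array([[0.0, 0.0], [1.0, 0.0]])
    w = np.array([1.0, 0.0])
    print("AUC:", empirical_auc(X_pos @ w, X_neg @ w))
    print("hinge surrogate:", pairwise_hinge_loss(w, X_pos, X_neg))
```

Because the surrogate is convex in `w`, minimizing it is a tractable stand-in for maximizing the AUC. A distributionally robust variant would instead minimize the worst case of this loss over distributions close to the empirical one in the Kantorovich metric.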