LGMLAug 3, 2012

On the Consistency of AUC Pairwise Optimization

arXiv:1208.0645v4136 citations
AI Analysis

This work addresses a foundational gap in machine learning by providing theoretical guarantees for AUC optimization, which is crucial for tasks like class-imbalance learning, but it is incremental as it builds on existing surrogate loss frameworks.

The paper tackles the problem of ensuring that surrogate loss functions used to optimize AUC are asymptotically consistent, proving that exponential and logistic losses are consistent while hinge loss is not, and deriving new consistent loss functions and bounds.

AUC (area under ROC curve) is an important evaluation criterion, which has been popularly used in many learning tasks such as class-imbalance learning, cost-sensitive learning, learning to rank, etc. Many learning approaches try to optimize AUC, while owing to the non-convexity and discontinuousness of AUC, almost all approaches work with surrogate loss functions. Thus, the consistency of AUC is crucial; however, it has been almost untouched before. In this paper, we provide a sufficient condition for the asymptotic consistency of learning approaches based on surrogate loss functions. Based on this result, we prove that exponential loss and logistic loss are consistent with AUC, but hinge loss is inconsistent. Then, we derive the $q$-norm hinge loss and general hinge loss that are consistent with AUC. We also derive the consistent bounds for exponential loss and logistic loss, and obtain the consistent bounds for many surrogate loss functions under the non-noise setting. Further, we disclose an equivalence between the exponential surrogate loss of AUC and exponential surrogate loss of accuracy, and one straightforward consequence of such finding is that AdaBoost and RankBoost are equivalent.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes