Estimating the Arc Length of the Optimal ROC Curve and Lower Bounding the Maximal AUC
This work addresses imbalanced binary classification with a novel method that estimates the arc length of the optimal ROC curve and maximizes an approximate lower bound of the maximal AUC, though it appears incremental as it builds on existing variational and f-divergence techniques.
The paper tackles the problem of estimating the arc length of the optimal ROC curve and lower bounding the maximal AUC. It shows that the arc length is an $f$-divergence and derives an estimator with non-parametric convergence rate $O_p(n^{-\beta/4})$, where $\beta \in (0,1]$ depends on the smoothness. Experiments on CIFAR-10 demonstrate good AUC performance in imbalanced binary classification.
In this paper, we show the arc length of the optimal ROC curve is an $f$-divergence. By leveraging this result, we express the arc length using a variational objective and estimate it accurately using positive and negative samples. We show this estimator has a non-parametric convergence rate $O_p(n^{-\beta/4})$ ($\beta \in (0,1]$ depends on the smoothness). Using the same technique, we show the surface area between the optimal ROC curve and the diagonal can be expressed via a similar variational objective. These new insights lead to a novel classification procedure that maximizes an approximate lower bound of the maximal AUC. Experiments on CIFAR-10 datasets show the proposed two-step procedure achieves good AUC performance in imbalanced binary classification tasks.
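The first claim can be checked numerically on a toy problem. The sketch below assumes two unit-variance Gaussians, $p = N(1,1)$ for positives and $q = N(0,1)$ for negatives; this pair is an illustrative choice, not taken from the paper. Their likelihood ratio $r = p/q$ is monotone in $x$, so the optimal ROC curve comes from thresholding $x$. Assuming the convex function is $f(t) = \sqrt{1+t^2}$, which is what the standard arc-length formula for a parametric curve gives, the length should equal $\int \sqrt{p(x)^2 + q(x)^2}\,dx = E_q[\sqrt{1 + r(x)^2}]$. The last function evaluates the generic $f$-divergence variational objective $E_p[T] + E_q[\sqrt{1 - T^2}]$ at its maximizer $T = r/\sqrt{1+r^2}$ using samples. It is not necessarily the paper's estimator, which would have to learn $T$ from data. All function names are hypothetical.

```python
import math
import numpy as np

def normal_pdf(x, mean):
    """Density of N(mean, 1)."""
    return np.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def normal_cdf(x, mean):
    """CDF of N(mean, 1), built on math.erf to avoid a SciPy dependency."""
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf((x - mean) / math.sqrt(2.0)))

def arc_length_from_densities(mu, grid):
    """Trapezoid-rule value of the integral of sqrt(p^2 + q^2) over the grid."""
    p = normal_pdf(grid, mu)
    q = normal_pdf(grid, 0.0)
    integrand = np.sqrt(p ** 2 + q ** 2)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)))

def arc_length_from_roc(mu, grid):
    """Polyline length of the ROC curve obtained by thresholding x."""
    fpr = 1.0 - normal_cdf(grid, 0.0)  # negatives scored above the threshold
    tpr = 1.0 - normal_cdf(grid, mu)   # positives scored above the threshold
    return float(np.sum(np.hypot(np.diff(fpr), np.diff(tpr))))

def variational_value(mu, n, seed=0):
    """Sample-based value of E_p[T] + E_q[sqrt(1 - T^2)] at the optimal T.

    Assumption: T is built from the true likelihood ratio, which is only
    available here because the toy densities are known in closed form.
    """
    rng = np.random.default_rng(seed)
    x_pos = rng.normal(mu, 1.0, size=n)
    x_neg = rng.normal(0.0, 1.0, size=n)

    def t_opt(x):
        r = np.exp(mu * x - 0.5 * mu ** 2)  # p(x) / q(x) for this Gaussian pair
        return r / np.hypot(1.0, r)

    neg_term = np.sqrt(np.maximum(0.0, 1.0 - t_opt(x_neg) ** 2))
    return float(np.mean(t_opt(x_pos)) + np.mean(neg_term))

grid = np.linspace(-10.0, 12.0, 20001)
len_density = arc_length_from_densities(1.0, grid)
len_roc = arc_length_from_roc(1.0, grid)
len_variational = variational_value(1.0, n=200_000)
print(len_density, len_roc, len_variational)
```

The three numbers should agree up to discretization and Monte Carlo error, and they should lie between $\sqrt{2}$ (identical classes, where the ROC curve is the diagonal) and $2$ (perfectly separable classes).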