LG ML

May 23, 2019

Binary Classification with Bounded Abstention Rate

Shubhanshu Shekhar, Mohammad Ghavamzadeh, Tara Javidi

arXiv:1905.09561v1

4.87 citations

Originality: Incremental advance

AI Analysis

This work addresses classification with controlled abstention for scenarios requiring reliability, but it is incremental as it builds upon prior theoretical results and methods.

The paper studies binary classification with abstention under a bounded-rate constraint. It characterizes the Bayes optimal classifier, proposes a plug-in classifier with excess-risk upper bounds, and derives lower bounds showing that the classifier is minimax near-optimal. For high dimensions it gives a computationally efficient algorithm based on convex loss surrogates, with excess-risk bounds in the realizable case, and compares that algorithm with a baseline on UCI benchmark datasets.

We consider the problem of binary classification with abstention in the relatively less studied \emph{bounded-rate} setting. We begin by obtaining a characterization of the Bayes optimal classifier for an arbitrary input-label distribution $P_{XY}$. Our result generalizes and provides an alternative proof for the result first obtained by \cite{chow1957optimum}, and then re-derived by \citet{denis2015consistency}, under a continuity assumption on $P_{XY}$. We then propose a plug-in classifier that employs unlabeled samples to decide the region of abstention and derive an upper-bound on the excess risk of our classifier under standard \emph{Hölder smoothness} and \emph{margin} assumptions. Unlike the plug-in rule of \citet{denis2015consistency}, our constructed classifier satisfies the abstention constraint with high probability and can also deal with discontinuities in the empirical cdf. We also derive lower-bounds that demonstrate the minimax near-optimality of our proposed algorithm. To address the excessive complexity of the plug-in classifier in high dimensions, we propose a computationally efficient algorithm that builds upon prior work on convex loss surrogates, and obtain bounds on its excess risk in the \emph{realizable} case. We empirically compare the performance of the proposed algorithm with a baseline on a number of UCI benchmark datasets.
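To make the plug-in idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes a simple k-nearest-neighbor estimate of $\eta(x) = P(Y=1 \mid X=x)$, and all function names (`knn_eta`, `abstention_threshold`, `predict_with_abstention`) and the `ABSTAIN` marker are hypothetical. The abstention threshold is picked from unlabeled samples so that at most a fraction `delta` of them fall strictly below it. Ties in the empirical cdf are resolved conservatively (fewer abstentions), which is a simplification and may differ from the treatment in the paper.

```python
import numpy as np

ABSTAIN = -1  # hypothetical marker for "no prediction"

def knn_eta(X_train, y_train, X_query, k=10):
    """Estimate eta(x) = P(Y=1 | X=x) as the mean label of the k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def abstention_threshold(scores, delta):
    """Pick t so that the fraction of scores strictly below t is at most delta.

    `scores` are confidence values |eta_hat(x) - 1/2| on unlabeled samples.
    """
    s = np.sort(scores)
    m = int(np.floor(delta * len(s)))  # number of points allowed to abstain
    if m >= len(s):
        return np.inf  # delta >= 1: abstaining everywhere is allowed
    # At most m sorted scores lie strictly below s[m] (fewer if there are ties).
    return s[m]

def predict_with_abstention(eta_hat, threshold):
    """Predict 0/1 from eta_hat, abstaining where the confidence is below threshold."""
    confidence = np.abs(eta_hat - 0.5)
    labels = (eta_hat >= 0.5).astype(int)
    return np.where(confidence < threshold, ABSTAIN, labels)

# Small synthetic demo (assumed data model, for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
X_unlabeled = rng.normal(size=(500, 2))

delta = 0.2
eta_unlabeled = knn_eta(X_train, y_train, X_unlabeled)
t = abstention_threshold(np.abs(eta_unlabeled - 0.5), delta)
preds = predict_with_abstention(eta_unlabeled, t)
abstain_rate = np.mean(preds == ABSTAIN)  # at most delta on this unlabeled sample
```

By construction the abstention rate is bounded by `delta` only on the unlabeled sample used to set the threshold. A guarantee on the population abstention rate with high probability, as claimed in the abstract, needs additional slack that this sketch does not include.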
