LG ML Feb 25, 2020

General Framework for Binary Classification on Top Samples

Lukáš Adam, Václav Mácha, Václav Šmídl, Tomáš Pevný

arXiv:2002.10923v13.32 citations, h-index: 31

Originality Synthesis-oriented

AI Analysis

This work addresses classification tasks where performance on the top-scoring samples is critical. It offers practitioners a unified approach, though its contribution is largely a synthesis of existing methods.

The paper tackles the problem of binary classification focusing on minimizing misclassification above or below a threshold, unifying ranking, accuracy at the top, and hypothesis testing into a general framework. It provides a theoretical analysis, numerical improvements, and guidelines for method selection based on experiments.

Many binary classification problems minimize misclassification above (or below) a threshold. We show that instances of ranking problems, accuracy at the top, or hypothesis testing may be written in this form. We propose a general framework to handle these classes of problems and show which methods (both known and newly proposed) fall into this framework. We provide a theoretical analysis of this framework and mention selected pitfalls the methods may encounter. We suggest several numerical improvements, including the implicit derivative and stochastic gradient descent. We provide an extensive numerical study. Based on both the theoretical properties and the numerical experiments, we conclude the paper by suggesting which method should be used in which situation.
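To make the phrase "misclassification above (or below) a threshold" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy scores, and the two threshold rules (highest-scoring negative, and a top-fraction quantile) are hypothetical choices made for this example. The point it shows is that the threshold is itself computed from the classifier scores, and the quantity of interest is the number of positives that fall below it.

```python
def false_negatives_below(scores, labels, threshold):
    """Count positives (label 1) whose score is strictly below the threshold."""
    return sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)


def top_negative_threshold(scores, labels):
    """Assumed ranking-style rule: threshold is the highest score among negatives."""
    return max(s for s, y in zip(scores, labels) if y == 0)


def top_fraction_threshold(scores, tau):
    """Assumed accuracy-at-the-top rule: score of the last sample
    inside the top `tau` fraction of all samples."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(tau * len(ranked)))  # number of samples counted as "top"
    return ranked[k - 1]


# Toy data (hypothetical): classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 1, 0, 0]

t_rank = top_negative_threshold(scores, labels)
fn_rank = false_negatives_below(scores, labels, t_rank)

t_top = top_fraction_threshold(scores, 0.5)
fn_top = false_negatives_below(scores, labels, t_top)

print(t_rank, fn_rank)
print(t_top, fn_top)
```

Changing only the threshold rule changes which problem is being solved, while the counted quantity stays the same. A trainable version would typically replace the hard count with a surrogate loss; that step is omitted here.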
