Distribution-Free Rates in Neyman-Pearson Classification
This provides theoretical insights for unbalanced classification problems, but it is incremental as it builds on existing Neyman-Pearson frameworks.
The paper tackles the problem of Neyman-Pearson classification in unbalanced settings by characterizing distribution-free minimax rates over all pairs of distributions, identifying a dichotomy between hard and easy classifier classes based on a geometric condition related to VC dimension.
We consider the problem of Neyman-Pearson classification which models unbalanced classification settings where error w.r.t. a distribution $μ_1$ is to be minimized subject to low error w.r.t. a different distribution $μ_0$. Given a fixed VC class $\mathcal{H}$ of classifiers to be minimized over, we provide a full characterization of possible distribution-free rates, i.e., minimax rates over the space of all pairs $(μ_0, μ_1)$. The rates involve a dichotomy between hard and easy classes $\mathcal{H}$ as characterized by a simple geometric condition, a three-points-separation condition, loosely related to VC dimension.