How to Control the Error Rates of Binary Classifiers
This work addresses the problem of controlling error rates in binary classification for users who need a specific error rate to stay below a given threshold. It is an incremental contribution that applies existing statistical techniques to a known bottleneck.
The study addresses the lack of user control over the false positive and false negative rates of binary classifiers. It combines classification with statistical hypothesis testing to keep a chosen target error rate below a predefined threshold, and shows how to calculate classification p-values for this purpose.
The traditional binary classification framework constructs classifiers that may have good accuracy, but whose false positive and false negative error rates are not under users' control. In many cases, one of the two errors is more severe, and a classifier is acceptable only if the corresponding rate is lower than a predefined threshold. In this study, we combine binary classification with statistical hypothesis testing to control the target error rate of already trained classifiers. In particular, we show how to turn binary classifiers into statistical tests, calculate the classification p-values, and use them to limit the target error rate.
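As an illustration of the general idea, the following minimal sketch shows one common way to obtain a classification p-value from an already trained scoring classifier. It is an assumption made for illustration, not necessarily the construction used in this study. The null hypothesis is "the object belongs to the negative class", the classifier's score serves as the test statistic, and the p-value is estimated empirically from the scores of held-out negative examples. The names `classification_p_value`, `classify`, `null_scores_sorted`, and `alpha` are hypothetical, and the Gaussian scores in the demo are synthetic stand-ins for real classifier outputs.

```python
import random
from bisect import bisect_left


def classification_p_value(score, null_scores_sorted):
    """Empirical p-value for H0: 'the object belongs to the negative class'.

    `null_scores_sorted` holds classifier scores of held-out negative
    examples, sorted in ascending order (an illustrative assumption).
    Larger scores are taken as stronger evidence for the positive class.
    """
    n = len(null_scores_sorted)
    # Count held-out negative scores that are >= the observed score.
    at_least_as_extreme = n - bisect_left(null_scores_sorted, score)
    # The +1 terms keep the p-value strictly positive and valid
    # when the held-out and new negative scores are exchangeable.
    return (1 + at_least_as_extreme) / (n + 1)


def classify(score, null_scores_sorted, alpha):
    """Predict positive (1) only if H0 is rejected at level alpha."""
    return 1 if classification_p_value(score, null_scores_sorted) <= alpha else 0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic scores standing in for a trained classifier's outputs.
    held_out_negatives = sorted(rng.gauss(0.0, 1.0) for _ in range(999))
    new_negatives = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    alpha = 0.05
    false_positives = sum(
        classify(s, held_out_negatives, alpha) for s in new_negatives
    )
    print("target false positive rate:", alpha)
    print("observed false positive rate:", false_positives / len(new_negatives))
```

Under this construction the false positive rate is the targeted error: a negative object is labelled positive only when its p-value falls at or below `alpha`. Controlling the false negative rate instead would swap the roles of the two classes, with held-out positive examples supplying the null scores.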