Learning Acceptance Regions for Many Classes with Anomaly Detection
This work addresses set-valued classification in settings with many classes, where classes unseen during training may appear at test time, and offers an incremental improvement over existing methods.
The paper proposes a Generalized Prediction Set (GPS) approach to set-valued classification that estimates acceptance regions while allowing for new classes in the test data and reducing the computational cost when the number of classes is large. Theoretical and numerical results support its balance between accuracy, efficiency, and anomaly detection rate.
Set-valued classification, a new classification paradigm that aims to identify all the plausible classes to which an observation may belong, can be carried out by learning an acceptance region for each class. Many existing set-valued classification methods do not consider the possibility that a class absent from the training data appears in the test data. Moreover, they are computationally expensive when the number of classes is large. We propose a Generalized Prediction Set (GPS) approach to estimate the acceptance regions while accounting for the possibility of a new class in the test data. The proposed classifier minimizes the expected size of the prediction set while guaranteeing that the class-specific accuracy is at least a pre-specified value. Unlike previous methods, the proposed method achieves a good balance between accuracy, efficiency, and anomaly detection rate. Moreover, our method can be applied to all classes in parallel, which alleviates the computational burden. Theoretical analysis and numerical experiments illustrate the effectiveness of the proposed method.
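To make the acceptance-region idea concrete, the following is a minimal sketch and not the GPS estimator itself, whose details are not given in this abstract. It assumes a simple distance-to-centroid score and a conformal-style quantile threshold for each class. The function names `fit_acceptance_regions` and `predict_set`, the choice of score, and the split of each class into fitting and calibration halves are all hypothetical choices made for illustration. An observation that falls in no acceptance region receives an empty prediction set, which can be read as a possible new class.

```python
import numpy as np


def fit_acceptance_regions(X, y, alpha=0.05, seed=0):
    """Fit one acceptance region {x : score_k(x) >= t_k} per class.

    Illustrative assumption: score_k(x) is the negative squared distance
    to the centroid of class k (higher means more typical of the class).
    """
    rng = np.random.default_rng(seed)
    regions = {}
    # Each class is handled independently, so this loop could run in parallel.
    for k in np.unique(y):
        Xk = X[y == k]
        if len(Xk) < 2:
            raise ValueError("need at least two samples per class")
        idx = rng.permutation(len(Xk))
        half = len(Xk) // 2
        fit_part, cal_part = Xk[idx[:half]], Xk[idx[half:]]
        center = fit_part.mean(axis=0)
        cal_scores = -np.sum((cal_part - center) ** 2, axis=1)
        # Threshold at the r-th smallest calibration score, r = floor(alpha * (n + 1)),
        # so that roughly a fraction alpha of class-k points is rejected.
        n = len(cal_scores)
        r = int(np.floor(alpha * (n + 1)))
        t = -np.inf if r == 0 else np.sort(cal_scores)[r - 1]
        regions[k.item()] = (center, t)
    return regions


def predict_set(regions, x):
    """Return every class whose acceptance region contains x.

    An empty list means x lies outside all regions (possible new class).
    """
    x = np.asarray(x, dtype=float)
    return [k for k, (center, t) in regions.items()
            if -np.sum((x - center) ** 2) >= t]
```

In this sketch, `alpha` plays the role of one minus the pre-specified class-specific accuracy. A tighter score function would give smaller prediction sets at the same accuracy level, which is the trade-off the abstract describes.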