Calibrated simplex-mapping classification
This work addresses the need for calibrated classifiers in applications requiring confidence estimates, though it appears incremental as it builds on existing simplex and regression techniques.
The authors tackle multi-class classification with a two-step method: the training data is first mapped to a simplex-based latent space, and this representation is then extended to the whole feature space via regression. The result is a well-calibrated classifier whose performance is demonstrated on synthetic and real-world datasets.
We propose a novel methodology for general multi-class classification in arbitrary feature spaces, which results in a potentially well-calibrated classifier. Calibrated classifiers are important in many applications because, in addition to predicting class labels, they also yield a confidence level for each of their predictions. In essence, the training of our classifier proceeds in two steps. In the first step, the training data is represented in a latent space whose geometry is induced by a regular $(n-1)$-dimensional simplex, $n$ being the number of classes. We design this representation so that it faithfully reflects the feature-space distances of the datapoints to their own-class and foreign-class neighbors. In the second step, the latent-space representation of the training data is extended to the whole feature space by fitting a regression model to the transformed data. With this latent-space representation, our calibrated classifier is readily defined. We rigorously establish its core theoretical properties and benchmark its prediction and calibration properties on various synthetic and real-world data sets from different application domains.
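To make the two-step structure concrete, the following is a minimal toy sketch in Python with NumPy. It is not the construction from the paper: the abstract does not specify the latent mapping, the regression model, or the decision rule, so all three are assumptions chosen for illustration. Here the hypothetical `latent_targets` weights each simplex vertex by the inverse distance to the nearest neighbor of that class, the regression is a plain affine least-squares fit, and prediction picks the nearest vertex. Only `simplex_vertices`, which builds a regular $(n-1)$-simplex, follows directly from the geometry described above.

```python
import numpy as np

def simplex_vertices(n):
    """Vertices of a regular (n-1)-simplex as an (n, n-1) array, centroid at the origin."""
    centered = np.eye(n) - 1.0 / n       # standard basis vectors of R^n, shifted to centroid 0
    _, _, vt = np.linalg.svd(centered)   # first n-1 right singular vectors span that hyperplane
    return centered @ vt[: n - 1].T      # coordinates in R^(n-1); pairwise distances stay sqrt(2)

def latent_targets(X, y, V, eps=1e-9):
    """Step 1 (assumed rule): convex combination of vertices, weighted by inverse
    distance to the nearest own-class and foreign-class neighbors."""
    n = V.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbor
    d = np.stack([D[:, y == c].min(axis=1) for c in range(n)], axis=1)
    w = 1.0 / (d + eps)
    w /= w.sum(axis=1, keepdims=True)    # rows sum to 1, so targets lie inside the simplex
    return w @ V

def fit_linear(X, Z):
    """Step 2 (assumed model): affine least-squares regression from features to latent space."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, Z, rcond=None)
    return coef

def predict(X, coef, V):
    """Map points to the latent space and assign the class of the nearest vertex (assumed rule)."""
    Z = np.hstack([X, np.ones((len(X), 1))]) @ coef
    dist = np.linalg.norm(Z[:, None, :] - V[None, :, :], axis=2)
    return dist.argmin(axis=1), Z

# Synthetic demo: three well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + 0.5 * rng.standard_normal((90, 2))

V = simplex_vertices(3)
Z_train = latent_targets(X, y, V)
coef = fit_linear(X, Z_train)
labels, Z_pred = predict(X, coef, V)
acc = (labels == y).mean()
print("training accuracy:", acc)
```

In this sketch the position of a predicted latent point relative to the simplex vertices is what a confidence estimate would be read from; how the paper turns that position into calibrated confidence levels is not stated in the abstract and is deliberately left out here.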