LG AI ML | Sep 4, 2022

ProBoost: a Boosting Method for Probabilistic Classifiers

Fábio Mendonça, Sheikh Shanawaz Mostafa, Fernando Morgado-Dias, Antonio G. Ravelo-García, Mário A. T. Figueiredo

arXiv:2209.01611v11.8 | h-index: 20

Originality: Incremental advance

AI Analysis

This work addresses performance enhancement for probabilistic classifiers, particularly in computer vision, but appears incremental as it builds on existing boosting and uncertainty estimation techniques.

The authors propose ProBoost, a boosting algorithm for probabilistic classifiers that uses the epistemic uncertainty of each training sample to identify the most challenging samples and increase their relevance for the next weak learner. In experiments on MNIST benchmark datasets, a ProBoost model with only four weak learners improved the relative achievable improvement metric by more than 12% (for accuracy, sensitivity, or specificity) compared with the model learned without ProBoost.

ProBoost, a new boosting algorithm for probabilistic classifiers, is proposed in this work. This algorithm uses the epistemic uncertainty of each training sample to determine the most challenging/uncertain ones; the relevance of these samples is then increased for the next weak learner, producing a sequence that progressively focuses on the samples found to have the highest uncertainty. In the end, the weak learners' outputs are combined into a weighted ensemble of classifiers. Three methods are proposed to manipulate the training set: undersampling, oversampling, and weighting the training samples according to the uncertainty estimated by the weak learners. Furthermore, two approaches are studied regarding the ensemble combination. The weak learner herein considered is a standard convolutional neural network, and the probabilistic models underlying the uncertainty estimation use either variational inference or Monte Carlo dropout. The experimental evaluation carried out on MNIST benchmark datasets shows that ProBoost yields a significant performance improvement. The results are further highlighted by assessing the relative achievable improvement, a metric proposed in this work, which shows that a model with only four weak learners leads to an improvement exceeding 12% in this metric (for either accuracy, sensitivity, or specificity), in comparison to the model learned without ProBoost.
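The loop described in the abstract (estimate per-sample epistemic uncertainty, raise the relevance of uncertain samples, combine the weak learners into a weighted ensemble) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how the uncertainty is computed, so the sketch assumes mutual information over Monte Carlo dropout passes. The function names, the `floor` parameter, and the learner weights are all hypothetical, and the convolutional weak learner is replaced by precomputed probability arrays.

```python
import numpy as np


def epistemic_uncertainty(mc_probs, eps=1e-12):
    """Per-sample epistemic uncertainty from stochastic forward passes.

    mc_probs has shape (T, N, C): T Monte Carlo passes, N samples, C classes.
    Assumption: mutual information (predictive entropy minus expected entropy)
    is used as the uncertainty measure; the paper may use a different one.
    """
    mean_probs = mc_probs.mean(axis=0)  # (N, C)
    predictive_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)
    expected_entropy = -(mc_probs * np.log(mc_probs + eps)).sum(axis=2).mean(axis=0)
    # Clip tiny negative values caused by floating-point rounding.
    return np.clip(predictive_entropy - expected_entropy, 0.0, None)


def uncertainty_weights(uncertainty, floor=1e-3):
    """Turn uncertainties into normalized sample weights (the weighting variant).

    The floor (hypothetical) keeps confident samples from receiving zero weight.
    """
    w = uncertainty + floor
    return w / w.sum()


def oversample_indices(weights, n_extra, rng):
    """Oversampling variant: keep every sample and add n_extra draws,
    picked with probability proportional to the uncertainty weights."""
    extra = rng.choice(len(weights), size=n_extra, replace=True, p=weights)
    return np.concatenate([np.arange(len(weights)), extra])


def weighted_ensemble(learner_probs, learner_weights):
    """Combine K weak learners' class probabilities, shape (K, N, C),
    with normalized per-learner weights and return predicted labels."""
    w = np.asarray(learner_weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, learner_probs, axes=1).argmax(axis=1)
```

In a full training loop, the index array returned by `oversample_indices` (or the weights themselves, passed as per-sample loss weights) would define the training set for the next weak learner. An undersampling variant would instead draw a subset of the original indices using the same weights.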
