LG, ML · Jan 16, 2020

Better Boosting with Bandits for Online Learning

arXiv:2001.06105v1 · 1 citation
AI Analysis

This work addresses the calibration of online boosting for machine learning practitioners, though its contribution is incremental: it adapts existing bandit optimization to one specific problem.

The paper tackles poorly calibrated probability estimates in online boosting ensembles. It introduces a bandit-based method that decides, on each round, whether a new example should update the ensemble or the calibrator, and it reports better probability estimation than uncalibrated and naively calibrated methods.

Probability estimates generated by boosting ensembles are poorly calibrated because of the margin-maximizing nature of the algorithm. The outputs of the ensemble need to be properly calibrated before they can be used as probability estimates. In this work, we demonstrate that online boosting is also prone to producing distorted probability estimates. In batch learning, calibration is achieved by reserving part of the training data for training the calibrator function. In the online setting, a decision needs to be made on each round: should the new example(s) be used to update the parameters of the ensemble or those of the calibrator? We resolve this decision with the aid of bandit optimization algorithms. We demonstrate superior performance to uncalibrated and naively calibrated online boosting ensembles in terms of probability estimation. Our proposed mechanism can be easily adapted to other tasks (e.g., cost-sensitive classification) and is robust to the choice of hyperparameters of both the calibrator and the ensemble.
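The per-round decision described in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: UCB1 as the bandit, the change in squared error (Brier score) on the incoming example as the reward, a linear margin scorer standing in for the online boosting ensemble, and a Platt-style sigmoid as the calibrator. The names `OnlineScorer`, `PlattCalibrator`, `UCB1`, `run`, and `make_stream` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineScorer:
    """Hypothetical stand-in for an online boosting ensemble: a linear margin scorer."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        # Perceptron-style step whenever the margin is below 1 (y in {0, 1}).
        sign = 1.0 if y == 1 else -1.0
        if sign * self.score(x) < 1.0:
            self.w = [wi + self.lr * sign * xi for wi, xi in zip(self.w, x)]


class PlattCalibrator:
    """Sigmoid calibrator p = sigmoid(a * s + b), trained by SGD on log loss."""

    def __init__(self, lr=0.05):
        self.a, self.b, self.lr = 1.0, 0.0, lr

    def prob(self, s):
        return sigmoid(self.a * s + self.b)

    def update(self, s, y):
        g = self.prob(s) - y  # gradient of the log loss w.r.t. the logit
        self.a -= self.lr * g * s
        self.b -= self.lr * g


class UCB1:
    """Standard UCB1 over a fixed number of arms with rewards in [0, 1]."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        for arm, c in enumerate(self.counts):
            if c == 0:
                return arm  # pull every arm once before using the bound
        t = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda a: self.means[a] + math.sqrt(2 * math.log(t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run(stream, dim):
    scorer, calib, bandit = OnlineScorer(dim), PlattCalibrator(), UCB1(2)
    for x, y in stream:
        before = (calib.prob(scorer.score(x)) - y) ** 2
        arm = bandit.select()  # arm 0: update ensemble, arm 1: update calibrator
        if arm == 0:
            scorer.update(x, y)
        else:
            calib.update(scorer.score(x), y)
        after = (calib.prob(scorer.score(x)) - y) ** 2
        # Assumed reward: squared-error improvement, rescaled from [-1, 1] to [0, 1].
        bandit.update(arm, (before - after + 1.0) / 2.0)
    return scorer, calib, bandit


def make_stream(n, seed=0):
    # Synthetic binary stream; the last feature is a constant bias term.
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1), 1.0]
        y = 1 if rng.random() < sigmoid(2.0 * x[0] - x[1]) else 0
        yield x, y
```

Calling `run(make_stream(500), dim=3)` returns the scorer, the calibrator, and the bandit; `bandit.counts` then shows how the 500 examples were split between the two update targets. Measuring the reward on the same example that was just used for the update is a simplification, and a held-out or prequential loss would be a less optimistic signal.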

