Efficient Online Bootstrapping for Large Scale Learning
This work addresses the problem of scaling bootstrapping to large datasets for practitioners. Its contribution is incremental, since it builds on existing bootstrapping techniques.
The authors tackle the high computational cost of bootstrapping for uncertainty estimation with a highly scalable online bootstrapping strategy. It is several times faster than traditional methods and may also improve prediction performance through model averaging.
Bootstrapping is a useful technique for estimating the uncertainty of a predictor, for example by providing confidence intervals for its predictions. Because of its high computational cost, it is typically used only on small to moderate-sized datasets. This work describes a highly scalable online bootstrapping strategy, implemented inside Vowpal Wabbit, that is several times faster than traditional strategies. Our experiments indicate that, in addition to providing a black-box method for estimating uncertainty, our implementation of online bootstrapping may also help train models with better prediction performance due to model averaging.
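To make the idea of online bootstrapping concrete, here is a minimal sketch in plain Python. It does not use Vowpal Wabbit and is not the implementation described above. It assumes one common way of bootstrapping a data stream, which the text here does not confirm as the authors' choice: each of several model replicates sees every example once, weighted by an independent Poisson(1) draw that stands in for the number of times the example would appear in a resample. The names OnlineBootstrap and poisson1, the squared-loss linear model, and the learning rate are all hypothetical choices made for illustration.

```python
import math
import random


def poisson1(rng):
    """Draw one sample from Poisson(1) using Knuth's multiplication method."""
    threshold = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


class OnlineBootstrap:
    """Hypothetical helper: several linear models trained in a single pass.

    Assumption: Poisson(1) importance weights approximate resampling
    with replacement when the data arrives as a stream.
    """

    def __init__(self, n_features, n_replicates=20, lr=0.05, seed=0):
        self.rng = random.Random(seed)
        self.lr = lr
        # One weight vector per bootstrap replicate.
        self.weights = [[0.0] * n_features for _ in range(n_replicates)]

    @staticmethod
    def _dot(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    def learn(self, x, y):
        """Update every replicate with one example (squared-loss SGD step)."""
        for w in self.weights:
            k = poisson1(self.rng)  # how often this replicate "sees" x
            if k == 0:
                continue  # the example is left out of this replicate
            grad_scale = k * (self._dot(w, x) - y)
            for j, xj in enumerate(x):
                w[j] -= self.lr * grad_scale * xj

    def predict(self, x):
        """Return the averaged prediction and a rough 5%-95% interval."""
        preds = sorted(self._dot(w, x) for w in self.weights)
        mean = sum(preds) / len(preds)
        lo = preds[int(0.05 * (len(preds) - 1))]
        hi = preds[int(0.95 * (len(preds) - 1))]
        return mean, lo, hi


if __name__ == "__main__":
    # Synthetic stream: y = 2*x + 1 plus Gaussian noise (made-up data).
    data_rng = random.Random(42)
    model = OnlineBootstrap(n_features=2)
    for _ in range(2000):
        x = data_rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1)
        model.learn([1.0, x], y)  # the leading 1.0 is a bias feature
    print(model.predict([1.0, 0.5]))
```

In this sketch the data is read only once and all replicates are updated together, which is where the savings over training each bootstrap model on its own resampled copy would come from. The mean over replicates plays the role of the model-averaged prediction, and the spread across replicates gives the uncertainty estimate.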