LGMLJul 1, 2020

Second Order PAC-Bayesian Bounds for the Weighted Majority Vote

arXiv:2007.13532v245 citations
AI Analysis

This work addresses the challenge of optimizing ensemble weights for practitioners in machine learning, though it is incremental as it builds on existing PAC-Bayesian bounds.

The paper tackles the problem of bounding the expected risk of weighted majority votes in multiclass classification by incorporating correlations among ensemble members, resulting in a bound that can be efficiently minimized to improve weighting without degrading test error in experiments.

We present a novel analysis of the expected risk of weighted majority vote in multiclass classification. The analysis takes correlation of predictions by ensemble members into account and provides a bound that is amenable to efficient minimization, which yields improved weighting for the majority vote. We also provide a specialized version of our bound for binary classification, which allows to exploit additional unlabeled data for tighter risk estimation. In experiments, we apply the bound to improve weighting of trees in random forests and show that, in contrast to the commonly used first order bound, minimization of the new bound typically does not lead to degradation of the test error of the ensemble.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes