Ensemble Validation: Selectivity has a Price, but Variety is Free
This paper offers theoretical insight into ensemble methods for machine learning practitioners, but the contribution is incremental: it builds on existing error-bound analysis and introduces no new practical algorithms.
The paper addresses error bounds for ensemble classifiers that select a member at random for each input. It shows that the ensemble's error bound consists of the average error bound of the members plus a selectivity term that ranges from zero to a standard uniform bound, and that a richer hypothesis set incurs no penalty as long as the fraction of hypothesis classifiers selected stays the same.
Suppose some classifiers are selected from a set of hypothesis classifiers to form an equally-weighted ensemble that selects a member classifier at random for each input example. Then the ensemble has an error bound consisting of the average error bound for the member classifiers, a term for selectivity that varies from zero (if all hypothesis classifiers are selected) to a standard uniform error bound (if only a single classifier is selected), and small constants. There is no penalty for using a richer hypothesis set if the same fraction of the hypothesis classifiers is selected for the ensemble.
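
The following is a minimal sketch of why the average over member classifiers appears as the leading term: an ensemble that picks one member uniformly at random per input has an expected error equal to the mean of its members' errors. The toy data, the threshold classifiers, and all names (`member_error`, `random_selection_error`, `make_threshold`) are illustrative assumptions, not taken from the paper, and the sketch does not compute the selectivity term or any bound.

```python
import random


def member_error(classifier, examples):
    """Fraction of (x, y) examples that a single classifier gets wrong."""
    return sum(classifier(x) != y for x, y in examples) / len(examples)


def random_selection_error(members, examples, rng):
    """Error of an equally-weighted ensemble that picks one member
    uniformly at random for each input example."""
    mistakes = 0
    for x, y in examples:
        chosen = rng.choice(members)  # fresh random member per input
        mistakes += chosen(x) != y
    return mistakes / len(examples)


def make_threshold(t):
    """Hypothetical hypothesis: predict 1 when x >= t, else 0."""
    return lambda x: int(x >= t)


rng = random.Random(0)

# Toy data (an assumption for illustration): x uniform on [0, 1),
# true label is 1 when x >= 0.5.
examples = [(x, int(x >= 0.5)) for x in (rng.random() for _ in range(20000))]

# Ensemble members: five threshold classifiers of varying quality.
members = [make_threshold(t) for t in (0.3, 0.4, 0.5, 0.6, 0.7)]

avg_member_error = sum(member_error(h, examples) for h in members) / len(members)
ensemble_error = random_selection_error(members, examples, rng)

print(f"average member error:   {avg_member_error:.4f}")
print(f"random-selection error: {ensemble_error:.4f}")
```

The two printed values should agree up to sampling noise, since each example's chance of being misclassified by the ensemble is the fraction of members that err on it.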