LG | AI | Aug 15, 2022

Combining Predictions under Uncertainty: The Case of Random Decision Trees

arXiv:2208.07403v1 | 1 citation | h-index: 41
Originality: Synthesis-oriented
AI Analysis

This work offers an incremental improvement to ensemble methods for classification and is relevant to machine learning researchers and practitioners.

The paper examines how best to combine probabilistic estimates from multiple sources, specifically in ensembles of random decision trees. It finds that averaging probabilities is hard to beat, although evidence accumulation performed consistently better on all but very small leaves.

A common approach to aggregating classification estimates in an ensemble of decision trees is either to use voting or to average the probabilities for each class. The latter takes uncertainty into account, but not the reliability of the uncertainty estimates (so to speak, the "uncertainty about the uncertainty"). More generally, much remains unknown about how to best combine probabilistic estimates from multiple sources. In this paper, we investigate a number of alternative prediction methods. Our methods are inspired by the theories of probability, belief functions and reliable classification, as well as a principle that we call evidence accumulation. Our experiments on a variety of data sets are based on random decision trees, which guarantees a high diversity in the predictions to be combined. Somewhat unexpectedly, we found that taking the average over the probabilities is actually hard to beat. However, evidence accumulation showed consistently better results on all but very small leaves.
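The two baseline aggregation rules named in the abstract, voting and probability averaging, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names `vote` and `average` and the example probabilities are assumptions made up for this sketch. It shows a case where the two rules disagree because one tree is far more confident than the others.

```python
import numpy as np

def vote(probs):
    """Majority vote: each tree casts one vote for its most probable class."""
    probs = np.asarray(probs, dtype=float)
    winners = probs.argmax(axis=1)  # predicted class per tree
    counts = np.bincount(winners, minlength=probs.shape[1])
    return int(counts.argmax())

def average(probs):
    """Average the per-class probabilities over all trees."""
    probs = np.asarray(probs, dtype=float)
    return probs.mean(axis=0)

# Hypothetical leaf estimates from three trees for two classes (rows = trees).
tree_probs = np.array([
    [0.55, 0.45],  # weakly prefers class 0
    [0.60, 0.40],  # weakly prefers class 0
    [0.05, 0.95],  # strongly prefers class 1
])

voted_class = vote(tree_probs)      # two of three trees prefer class 0
avg = average(tree_probs)           # mean probability per class
averaged_class = int(avg.argmax())  # class with the highest mean probability
```

Here voting selects class 0 while averaging selects class 1, since averaging lets the confident third tree outweigh the two weak votes. Neither rule uses how many training examples stood behind each leaf estimate, which is the reliability gap the abstract points to.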

Code Implementations: 1 repo