ML · LG · Mar 24

REALITrees: Rashomon Ensemble Active Learning for Interpretable Trees

arXiv:2603.22750 · 34.31 citations · h-index: 5
Predicted impact: top 50% in ML · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses active learning as a way to reduce labeling costs. Its approach is an incremental improvement on existing Query-by-Committee methods.

The paper proposes an active learning method that builds its committee from all near-optimal models (the Rashomon Set) to improve sample selection. It outperforms randomized ensembles and converges faster in moderately noisy environments.

Active learning reduces labeling costs by selecting samples that maximize information gain. A dominant framework, Query-by-Committee (QBC), typically relies on perturbation-based diversity by inducing model disagreement through random feature subsetting or data blinding. While this approximates one notion of epistemic uncertainty, it sacrifices direct characterization of the plausible hypothesis space. We propose the complementary approach: Rashomon Ensembled Active Learning (REAL), which constructs a committee by exhaustively enumerating the Rashomon Set of all near-optimal models. To address functional redundancy within this set, we adopt a PAC-Bayesian framework using a Gibbs posterior to weight committee members by their empirical risk. Leveraging recent algorithmic advances, we exactly enumerate this set for the class of sparse decision trees. Across synthetic and established active learning baselines, REAL outperforms randomized ensembles, particularly in moderately noisy environments where it strategically leverages expanded model multiplicity to achieve faster convergence.
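
The query step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the candidate models have already been enumerated and are represented only by their predictions on the unlabeled pool. The function names, the `epsilon` tolerance, the `temperature` parameter, and the use of vote entropy as the disagreement score are all illustrative assumptions. Enumeration of sparse decision trees is not shown.

```python
import math

def rashomon_set(risks, epsilon):
    """Indices of models whose empirical risk is within epsilon of the best risk."""
    best = min(risks)
    return [i for i, r in enumerate(risks) if r <= best + epsilon]

def gibbs_weights(risks, members, temperature):
    """Weights proportional to exp(-temperature * risk), normalized to sum to 1.

    Subtracting the best risk before exponentiating leaves the normalized
    weights unchanged and avoids underflow for large temperatures.
    """
    best = min(risks[i] for i in members)
    raw = [math.exp(-temperature * (risks[i] - best)) for i in members]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_vote_entropy(votes, weights):
    """Entropy of the weight-averaged label distribution for one pool sample."""
    mass = {}
    for label, w in zip(votes, weights):
        mass[label] = mass.get(label, 0.0) + w
    return -sum(p * math.log(p) for p in mass.values() if p > 0)

def select_query(predictions, risks, epsilon=0.05, temperature=10.0):
    """Return (pool index with the highest weighted disagreement, all scores)."""
    members = rashomon_set(risks, epsilon)
    weights = gibbs_weights(risks, members, temperature)
    n_pool = len(predictions[0])
    scores = [
        weighted_vote_entropy([predictions[i][j] for i in members], weights)
        for j in range(n_pool)
    ]
    return max(range(n_pool), key=scores.__getitem__), scores

# Hypothetical toy data: 4 candidate models, 4 unlabeled pool samples.
predictions = [
    [0, 1, 1, 0],  # model 0
    [0, 1, 0, 0],  # model 1
    [0, 0, 1, 0],  # model 2
    [1, 1, 1, 1],  # model 3 (high risk, falls outside the Rashomon set)
]
risks = [0.10, 0.12, 0.13, 0.40]

members = rashomon_set(risks, epsilon=0.05)
weights = gibbs_weights(risks, members, temperature=10.0)
query, scores = select_query(predictions, risks)
```

In this toy setup the high-risk model is excluded from the committee, and the sample the remaining members split on most evenly (after weighting) is chosen for labeling. A perturbation-based QBC would instead build the committee from randomized refits of a single learner.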

