LGAIMLFeb 10, 2024

Tree Ensembles for Contextual Bandits

arXiv:2402.06963v32 citationsh-index: 16Trans. Mach. Learn. Res.
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient decision-making in contextual bandits for applications like navigation, though it appears incremental as it builds on existing bandit methods with tree ensembles.

The paper tackles the problem of contextual multi-armed bandits by proposing a framework based on tree ensembles, adapting methods like Upper Confidence Bound and Thompson Sampling. It demonstrates superior performance in regret minimization and computational runtime compared to state-of-the-art methods on benchmark datasets and a real-world navigation application.

We propose a new framework for contextual multi-armed bandits based on tree ensembles. Our framework adapts two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both standard and combinatorial settings. As part of this framework, we propose a novel method of estimating the uncertainty in tree ensemble predictions. We further demonstrate the effectiveness of our framework via several experimental studies, employing XGBoost and random forests, two popular tree ensemble methods. Compared to state-of-the-art methods based on decision trees and neural networks, our methods exhibit superior performance in terms of both regret minimization and computational runtime, when applied to benchmark datasets and the real-world application of navigation over road networks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes