ML · LG · Jun 10, 2016

Causal Bandits: Learning Good Interventions via Causal Inference

arXiv:1606.03203v1 · 189 citations
Originality: Highly original
AI Analysis

This work addresses efficient intervention learning for decision-making systems. It improves incrementally on existing bandit frameworks by integrating causal inference into them.

The paper tackles the problem of learning optimal interventions online in stochastic environments by combining causal models with multi-arm bandits. It proposes a new algorithm that exploits causal feedback and proves a simple regret bound for it that is strictly better than the bounds for non-causal methods.

We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
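To make the idea of causal feedback concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical toy model in which binary variables X_0..X_{n-1} are independent, unconfounded causes of a binary reward Y, so that conditioning on X_i = v gives the same reward distribution as intervening with do(X_i = v). Under that assumption, a single purely observational round yields a sample for n arms at once, whereas a standard bandit learns about one arm per round. The function names, the reward probabilities (0.8 and 0.2), and the choice that only X_0 matters are all illustrative assumptions. Simple regret is computed here for a single recommendation as the gap between the best arm's true mean reward and the recommended arm's true mean reward.

```python
import random


def simulate_observations(n_vars, n_rounds, rng):
    """Draw (X, Y) pairs with no intervention from a hypothetical toy model:
    each X_i is an independent fair coin and only X_0 affects the reward."""
    data = []
    for _ in range(n_rounds):
        x = [rng.randint(0, 1) for _ in range(n_vars)]
        p_reward = 0.8 if x[0] == 1 else 0.2  # assumed reward probabilities
        y = 1 if rng.random() < p_reward else 0
        data.append((x, y))
    return data


def estimate_arm_means(data, n_vars):
    """Estimate E[Y | do(X_i = v)] for every arm (i, v) from observations.

    This is valid only under the stated assumption that the X_i are
    unconfounded causes of Y, so conditioning equals intervening.
    """
    sums = {(i, v): 0 for i in range(n_vars) for v in (0, 1)}
    counts = {(i, v): 0 for i in range(n_vars) for v in (0, 1)}
    for x, y in data:
        # One observed round contributes a sample to n_vars different arms.
        for i, v in enumerate(x):
            sums[(i, v)] += y
            counts[(i, v)] += 1
    means = {arm: sums[arm] / counts[arm] for arm in sums if counts[arm] > 0}
    return means, counts


def simple_regret(true_means, chosen_arm):
    """Gap between the best arm's true mean and the chosen arm's true mean."""
    return max(true_means.values()) - true_means[chosen_arm]


N_VARS, N_ROUNDS = 5, 2000
rng = random.Random(0)  # fixed seed so the run is reproducible
data = simulate_observations(N_VARS, N_ROUNDS, rng)
estimates, counts = estimate_arm_means(data, N_VARS)
best_arm = max(estimates, key=estimates.get)

# True interventional means of the toy model, used only for evaluation.
true_means = {(i, v): 0.5 for i in range(N_VARS) for v in (0, 1)}
true_means[(0, 1)] = 0.8
true_means[(0, 0)] = 0.2

print("recommended arm:", best_arm)
print("simple regret:", simple_regret(true_means, best_arm))
```

In this sketch, 2000 observational rounds produce 10000 arm samples in total (2000 for each of the 5 variables), which illustrates why exploiting the causal structure can speed up learning. When some value X_i = v is rarely observed, observation alone is not enough and explicit interventions on that arm are still needed.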

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
