ML · LG · Jun 10, 2016

Causal Bandits: Learning Good Interventions via Causal Inference

Finnian Lattimore, Tor Lattimore, Mark D. Reid

arXiv:1606.03203v128.8189 citations

Originality: Highly original

AI Analysis

This work addresses the problem of learning interventions efficiently in decision-making systems; its contribution is an incremental improvement obtained by integrating causal inference into bandit frameworks.

The paper tackles the problem of learning good interventions online in stochastic environments by combining causal models with multi-armed bandits. It proposes a new algorithm that exploits causal feedback and proves a simple-regret bound for it that is strictly better than the bounds of algorithms that ignore the causal information.

We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
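To make the idea of causal feedback concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes a simple setting that the abstract does not spell out: binary variables X_1..X_N that each influence the reward Y with no confounding, so that the reward under the intervention do(X_i = j) can be estimated from purely observational rounds in which X_i happened to equal j. The function names `estimate_arm_means` and `simple_regret` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def estimate_arm_means(observations):
    """Estimate the mean reward of every arm do(X_i = j) from observational data.

    observations: iterable of (x, y), where x is a tuple of 0/1 values and
    y is the observed reward.
    ASSUMPTION (illustrative, not from the abstract): no confounding, so
    E[Y | do(X_i = j)] equals E[Y | X_i = j].
    Returns a dict {(i, j): mean reward over rounds with x[i] == j}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for x, y in observations:
        # A single observed round updates one arm per variable.
        for i, j in enumerate(x):
            totals[(i, j)] += y
            counts[(i, j)] += 1
    return {arm: totals[arm] / counts[arm] for arm in counts}


def simple_regret(true_means, chosen_arm):
    """Gap between the best arm's true mean and the chosen arm's true mean."""
    return max(true_means.values()) - true_means[chosen_arm]


# Four observational rounds over two binary variables.
observations = [
    ((1, 0), 1.0),
    ((1, 1), 1.0),
    ((0, 1), 0.0),
    ((0, 0), 0.0),
]
estimates = estimate_arm_means(observations)
best_arm = max(estimates, key=estimates.get)
```

In this toy data every round informs two arms at once, whereas a standard bandit learner that pulls one arm per round would learn about only one. That extra information per round is the kind of feedback the abstract describes as unexploited by existing approaches.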
