LG OC MLJan 31, 2023

Combinatorial Causal Bandits without Graph Skeleton

arXiv:2301.13392v410.76 citationsh-index: 5Has Code

Originality Incremental advance

AI Analysis

This work addresses a key limitation in causal inference for decision-making by enabling effective learning in combinatorial settings without unrealistic structural assumptions, though it is incremental as it builds on existing bandit frameworks.

The paper tackles the combinatorial causal bandits problem without requiring prior knowledge of the causal graph structure, achieving an O(√T ln T) expected regret under a weight gap assumption and O(T^{2/3} ln T) regret without it, matching state-of-the-art results that rely on graph knowledge.

In combinatorial causal bandits (CCB), the learning agent chooses a subset of variables in each round to intervene and collects feedback from the observed variables to minimize expected regret or sample complexity. Previous works study this problem in both general causal models and binary generalized linear models (BGLMs). However, all of them require prior knowledge of causal graph structure or unrealistic assumptions. This paper studies the CCB problem without the graph structure on binary general causal models and BGLMs. We first provide an exponential lower bound of cumulative regrets for the CCB problem on general causal models. To overcome the exponentially large space of parameters, we then consider the CCB problem on BGLMs. We design a regret minimization algorithm for BGLMs even without the graph skeleton and show that it still achieves $O(\sqrt{T}\ln T)$ expected regret, as long as the causal graph satisfies a weight gap assumption. This asymptotic regret is the same as the state-of-art algorithms relying on the graph structure. Moreover, we propose another algorithm with $O(T^{\frac{2}{3}}\ln T)$ regret to remove the weight gap assumption.

View on arXiv PDF Code

Similar