Resolving Spurious Correlations in Causal Models of Environments via Interventions
This work addresses spurious correlations in causal models for decision-making systems. The contribution is incremental: it builds on existing causal modeling approaches.
The paper tackles errors that spurious correlations introduce into causal models of reinforcement learning environments. It proposes a method that designs reward functions to incentivize interventions, and the data from those interventions are used to improve the model. Experimental results in a grid-world show better causal models compared to baselines.
Causal models bring many benefits to decision-making systems (or agents) by making them interpretable, sample-efficient, and robust to changes in the input distribution. However, spurious correlations can lead to wrong causal models and predictions. We consider the problem of inferring a causal model of a reinforcement learning environment and propose a method to deal with spurious correlations. Specifically, our method designs a reward function that incentivizes an agent to perform an intervention that exposes errors in the causal model. The data obtained from the intervention is then used to improve the causal model. We propose several intervention design methods and compare them. The experimental results in a grid-world environment show that our approach leads to better causal models than two baselines: learning the model on data from a random policy, and learning it on data from a policy trained on the environment's reward. The main contribution consists of methods to design interventions to resolve spurious correlations.
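To make the idea of an intervention-incentivizing reward concrete, the following is a minimal sketch, not the method from the paper. It assumes states are dictionaries of binary variables and that a candidate edge `switch -> door` is under test. The names `make_intervention_reward` and `effect_gap`, the variable names, and the hand-written data are all hypothetical and chosen only for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List

State = Dict[str, int]  # assumed state representation: variable name -> binary value


def make_intervention_reward(variable: str, value: int) -> Callable[[State], float]:
    """Build a reward that pays 1.0 whenever `variable` equals `value`.

    An agent trained on this reward is pushed to set the variable itself,
    which acts as an intervention on that variable.
    """
    def reward(state: State) -> float:
        return 1.0 if state[variable] == value else 0.0
    return reward


def effect_gap(data: List[State], cause: str, effect: str) -> float:
    """Absolute difference in the mean of `effect` between cause=1 and cause=0.

    Assumes `data` contains at least one sample for each value of `cause`.
    """
    on = [s[effect] for s in data if s[cause] == 1]
    off = [s[effect] for s in data if s[cause] == 0]
    return abs(mean(on) - mean(off))


# Hand-written toy data (illustrative only, not experimental results).
# Under passive observation the two variables always move together.
observational = [{"switch": 1, "door": 1}] * 4 + [{"switch": 0, "door": 0}] * 4
# When the agent sets the switch itself, the door does not respond.
interventional = [{"switch": 1, "door": 0}] * 2 + [{"switch": 0, "door": 0}] * 2

reward_fn = make_intervention_reward("switch", 1)
obs_gap = effect_gap(observational, "switch", "door")
int_gap = effect_gap(interventional, "switch", "door")
edge_is_spurious = obs_gap > 0.5 and int_gap < 0.5  # threshold 0.5 is an arbitrary assumption
```

In this toy setting the observational data suggests a strong dependence, while the interventional data shows none, so the candidate edge would be flagged for removal from the model. How interventions are actually selected and how the model is updated are design choices made in the paper and are not reproduced here.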