Partial Counterfactual Identification for Infinite Horizon Partially Observable Markov Decision Process
This work addresses the restriction of existing counterfactual-bounding methods to finite horizons in sequential decision-making problems, though it appears incremental because it mainly extends those methods to infinite horizons.
The paper tackles the problem of bounding counterfactual queries from observational data in infinite-horizon partially observable Markov decision processes, extending prior finite-horizon methods by modifying Q-learning algorithms, and demonstrates through simulations that the proposed algorithms outperform existing ones.
This paper investigates the problem of bounding the possible outputs of a counterfactual query given a set of observational data. While prior work has described efficient algorithms that provide optimal bounds for counterfactual queries, all of it assumes a finite-horizon causal diagram. This paper extends that work by modifying the Q-learning algorithm to provide informative bounds on a causal query given an infinite-horizon causal diagram. In simulations, our algorithms are shown to perform better than existing algorithms.
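To make the idea of bounding a query over an infinite horizon concrete, here is a minimal sketch. It is not the paper's algorithm. It swaps in a simpler, well-known technique: interval value iteration on a fully observed, discounted MDP whose transition probabilities are only known to lie in intervals. The function names, the interval inputs `lo` and `hi`, and the discount factor `gamma` are all assumptions made for illustration.

```python
import numpy as np

def extreme_expectation(v, lo, hi, maximize):
    """Largest (or smallest) p @ v over lo <= p <= hi with sum(p) == 1.

    Assumes sum(lo) <= 1 <= sum(hi), so the feasible set is non-empty.
    """
    p = lo.copy()
    slack = 1.0 - p.sum()  # probability mass still to be assigned
    # Give the remaining mass to the most (or least) valuable states first.
    order = np.argsort(-v if maximize else v)
    for j in order:
        add = min(hi[j] - p[j], slack)
        p[j] += add
        slack -= add
    return float(p @ v)

def bound_policy_value(reward, lo, hi, policy, gamma=0.9, iters=500):
    """Lower and upper bounds on the discounted value of a fixed policy.

    reward: array of shape (S, A); lo, hi: arrays of shape (S, A, S)
    holding elementwise bounds on the unknown transition probabilities.
    This is an illustrative stand-in, not the method from the paper.
    """
    n_states = reward.shape[0]
    v_lo = np.zeros(n_states)
    v_hi = np.zeros(n_states)
    for _ in range(iters):
        new_lo = np.empty(n_states)
        new_hi = np.empty(n_states)
        for s in range(n_states):
            a = policy[s]
            new_lo[s] = reward[s, a] + gamma * extreme_expectation(
                v_lo, lo[s, a], hi[s, a], maximize=False)
            new_hi[s] = reward[s, a] + gamma * extreme_expectation(
                v_hi, lo[s, a], hi[s, a], maximize=True)
        v_lo, v_hi = new_lo, new_hi
    return v_lo, v_hi
```

Because each update is a contraction when `gamma < 1`, both bounds converge without unrolling the horizon, which is the property that makes an infinite-horizon treatment tractable in this simplified setting. When `lo` equals `hi`, the two bounds coincide with the ordinary policy value.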