AISY, Dec 16, 2022

Causal Temporal Reasoning for Markov Decision Processes

arXiv:2212.08712v2, 2 citations, h-index: 7
AI Analysis

Existing probabilistic temporal logics can only reason about a fixed system configuration. This work addresses that limitation, because reasoning across different configurations is crucial for applications such as safe reinforcement learning.

The authors introduce PCFTL, a new probabilistic temporal logic for verifying Markov Decision Processes (MDPs). Its causal-reasoning operators express interventional and counterfactual queries, and the authors evaluate the logic in safe reinforcement learning on grid-world benchmarks.

We introduce PCFTL (Probabilistic CounterFactual Temporal Logic), a new probabilistic temporal logic for the verification of Markov Decision Processes (MDPs). PCFTL is the first to include operators for causal reasoning, allowing us to express interventional and counterfactual queries. Given a path formula $φ$, an interventional property is concerned with the satisfaction probability of $φ$ if we apply a particular change $I$ to the MDP (e.g., switching to a different policy); a counterfactual allows us to compute, given an observed MDP path $τ$, what the outcome of $φ$ would have been had we applied $I$ in the past. Because it can reason about what-if scenarios involving different configurations of the MDP, our approach departs from existing probabilistic temporal logics, which can only reason about a fixed system configuration. From a syntactic viewpoint, we introduce a generalized counterfactual operator that subsumes both interventional and counterfactual probabilities, as well as the traditional probabilistic operator found in, e.g., PCTL. From a semantic viewpoint, our logic is interpreted over a structural causal model translation of the MDP, which gives us a representation amenable to counterfactual reasoning. We evaluate PCFTL in the context of safe reinforcement learning using a benchmark of grid-world models.
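The difference between an interventional and a counterfactual query can be made concrete on a toy model. The sketch below is not the paper's algorithm or its semantics. It assumes a hypothetical five-cell grid world, two made-up actions (`cautious`, `bold`), and a simple structural causal model in which one uniform noise variable per step decides whether the agent slips. Every name and number is an illustrative assumption, and counterfactual probabilities in general depend on which structural causal model is chosen. Here $φ$ is "reach the goal within the horizon without entering the hazard", $I$ replaces the policy, and $τ$ is the observed path.

```python
# Toy 1-D grid world (all names and numbers are illustrative assumptions).
# States 0..4; 0 is a hazard and 4 is the goal, both absorbing.
HAZARD, GOAL = 0, 4
SLIP = {"cautious": 0.1, "bold": 0.3}    # P(slip | action)
STRIDE = {"cautious": 1, "bold": 2}      # cells advanced when the move succeeds

def move(state, action, slipped):
    """Deterministic part of the dynamics, given whether the agent slipped."""
    if state in (HAZARD, GOAL):
        return state
    return state - 1 if slipped else min(state + STRIDE[action], GOAL)

def step(state, action, u):
    """Structural equation s' = f(s, a, u) with exogenous noise u ~ U[0, 1)."""
    return move(state, action, u < SLIP[action])

def abduct(path, policy):
    """For each observed transition, the interval of u values that explain it."""
    intervals = []
    for s, s_next in zip(path, path[1:]):
        if s in (HAZARD, GOAL):
            intervals.append((0.0, 1.0))   # absorbing: no information about u
            continue
        a = policy(s)
        if s_next == move(s, a, True):
            intervals.append((0.0, SLIP[a]))
        elif s_next == move(s, a, False):
            intervals.append((SLIP[a], 1.0))
        else:
            raise ValueError("path is inconsistent with the policy")
    return intervals

def prob_goal(policy, start, horizon, intervals=()):
    """P(reach GOAL within `horizon` steps without entering HAZARD).

    intervals[t] confines u_t to [lo, hi); steps without an entry keep the
    U[0, 1) prior. With no intervals this is the interventional probability.
    """
    dist = {start: 1.0}
    for t in range(horizon):
        lo, hi = intervals[t] if t < len(intervals) else (0.0, 1.0)
        nxt = {}
        for s, mass in dist.items():
            a = policy(s)
            # P(u_t < SLIP[a] | lo <= u_t < hi)
            p_slip = (min(max(SLIP[a], lo), hi) - lo) / (hi - lo)
            for slipped, w in ((True, p_slip), (False, 1.0 - p_slip)):
                s2 = move(s, a, slipped)
                nxt[s2] = nxt.get(s2, 0.0) + mass * w
        dist = nxt
    return dist.get(GOAL, 0.0)

bold = lambda s: "bold"
cautious = lambda s: "cautious"

# Observed path under the bold policy: with u_0 = 0.2 the first move slips
# from cell 1 into the hazard.
observed = [1, step(1, "bold", 0.2)]
evidence = abduct(observed, bold)

# Interventional: switch to the cautious policy, no evidence about the noise.
p_int = prob_goal(cautious, start=1, horizon=3)
# Counterfactual: same switch, but the noise must be consistent with `observed`.
p_cf = prob_goal(cautious, start=1, horizon=3, intervals=evidence)
print(p_int, p_cf)
```

In this toy model the interventional probability is about 0.729 and the counterfactual one about 0.54. The observed slip tells us the first noise value was small, which makes a slip under the cautious policy more likely than it would be a priori. Re-running the counterfactual with the original bold policy returns 0, reproducing the observed failure, which is the consistency one expects from a counterfactual.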

