AI · May 10

Attribution-based Explanations for Markov Decision Processes

Paul Kobialka, Andrea Pferscher, Francesco Leofante, Erika Ábrahám, Silvia Lizeth Tapia Tarifa, Einar Broch Johnsen

arXiv:2605.09780 · 5.5

AI Analysis

For researchers and practitioners in AI explainability, this work addresses the lack of attribution methods for sequential decision-making, though it is an incremental extension of existing attribution techniques to MDPs.

This paper introduces attribution-based explanations for Markov Decision Processes (MDPs), assigning importance scores to states and execution paths. The approach is evaluated on five case studies, demonstrating interpretable insights into sequential decision-making.

Attribution techniques explain the outcome of an AI model by assigning a numerical score to its inputs. So far, these techniques have mainly focused on attributing importance to static input features at a single point in time, and thus fail to generalize to sequential decision-making settings. This paper fills this gap by introducing techniques to generate attribution-based explanations for Markov Decision Processes (MDPs). We give a formal characterization of what attributions should represent in MDPs, focusing on explanations that assign importance scores to both individual states and execution paths. We show how importance scores can be computed by leveraging techniques for strategy synthesis, enabling the efficient computation of these scores despite the non-determinism inherent in an MDP. We evaluate our approach on five case studies, demonstrating its utility in providing interpretable insights into the logic of sequential decision-making agents.
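To make the idea of state-level attribution concrete, here is a minimal Python sketch. It is not the paper's definition or algorithm: the abstract does not give the formal scores, so the sketch substitutes a simple, commonly used proxy. It synthesizes a reachability-maximizing strategy by value iteration and then scores each state by the gap between its best and worst action values. The toy MDP, the state and action names, and the functions `max_reach_values` and `state_importance` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MDP: state -> action -> list of (probability, successor).
# States with no actions are terminal.
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]

toy_mdp: MDP = {
    "s0": {"safe": [(1.0, "s1")], "risky": [(0.5, "goal"), (0.5, "fail")]},
    "s1": {"go": [(0.9, "goal"), (0.1, "fail")],
           "stall": [(0.5, "s1"), (0.5, "fail")]},
    "goal": {},
    "fail": {},
}


def q_value(values: Dict[str, float], successors: List[Tuple[float, str]]) -> float:
    """Expected value of one action, given current state values."""
    return sum(p * values[t] for p, t in successors)


def max_reach_values(mdp: MDP, goal: str, tol: float = 1e-12,
                     max_iters: int = 10_000) -> Dict[str, float]:
    """Value iteration for the maximal probability of eventually reaching `goal`.

    Starting from 0 for non-goal states, the iteration converges from below
    to the maximal reachability probability.
    """
    values = {s: 1.0 if s == goal else 0.0 for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in mdp.items():
            if s == goal or not actions:
                continue  # goal and terminal states keep their fixed value
            best = max(q_value(values, succ) for succ in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def state_importance(mdp: MDP, values: Dict[str, float]) -> Dict[str, float]:
    """Illustrative proxy score (an assumption, not the paper's measure):
    the gap between the best and worst action value in each state."""
    scores = {}
    for s, actions in mdp.items():
        if len(actions) < 2:
            scores[s] = 0.0  # no real choice, so the decision cannot matter
            continue
        qs = [q_value(values, succ) for succ in actions.values()]
        scores[s] = max(qs) - min(qs)
    return scores


values = max_reach_values(toy_mdp, "goal")
scores = state_importance(toy_mdp, values)
for state, score in scores.items():
    print(f"{state}: value={values[state]:.3f} importance={score:.3f}")
```

On this toy model, both `s0` and `s1` reach the goal with probability 0.9 under the synthesized strategy. Their proxy scores are about 0.4 and 0.45, reflecting how much a wrong choice in each state would cost. Terminal states score 0 because no decision is made there.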
