LG · Jun 9, 2023

Explaining Reinforcement Learning with Shapley Values

arXiv:2306.05810v1 · 46 citations · h-index: 17
Originality: Incremental advance
AI Analysis

The paper addresses users' need for interpretability in reinforcement learning, though it appears incremental because it builds on existing Shapley value methods.

To improve user trust in reinforcement learning systems, the paper proposes SVERL, a framework for explaining them based on Shapley values from game theory. Across a variety of domains, SVERL produces meaningful explanations that align with human intuition.

For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
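As a rough illustration of the game-theoretic idea the abstract refers to, the sketch below computes exact Shapley values for a small cooperative game: each player's value is its marginal contribution to a coalition, averaged over all coalitions with the standard Shapley weights. This is the generic textbook definition, not the paper's SVERL procedure. The function names, the feature-like player names, and the toy payoff function are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player ids.
    value:   function mapping a frozenset of players to a real payoff.
    Cost grows exponentially with the number of players, so this is only
    practical for small games.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: two useful players with an interaction bonus."""
    payoff = 0.0
    if "position" in coalition:
        payoff += 1.0
    if "velocity" in coalition:
        payoff += 2.0
    if {"position", "velocity"} <= coalition:
        payoff += 1.0  # bonus only when both are present
    return payoff


players = ["position", "velocity", "noise"]
phi = shapley_values(players, toy_value)
for name in players:
    print(f"{name}: {phi[name]:.3f}")
```

In this toy game the interaction bonus is split equally between the two players that create it, the player that never changes the payoff receives zero, and the values sum to the payoff of the full coalition. How the players and the payoff function are defined for a reinforcement learning agent is exactly what the paper analyses, and the toy payoff above should not be read as its definition.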

Code Implementations: 1 repo