AI | Feb 20, 2024

XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques

arXiv:2402.12685v1 | 5 citations | h-index: 16 | Has Code | KDD
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of evaluating explainable AI in reinforcement learning, for both researchers and practitioners. The contribution is incremental: it builds on existing XRL methods by providing a benchmark for them.

The paper tackles the lack of a unified evaluation framework for explainable reinforcement learning (XRL) methods by introducing XRL-Bench, a standardized benchmark for assessing and comparing state-explaining techniques, and demonstrates its utility with a new method called TabularSHAP in real-world online gaming services.

Reinforcement Learning (RL) has demonstrated substantial potential across diverse fields, yet understanding its decision-making process, especially in real-world scenarios where rationality and safety are paramount, is an ongoing challenge. This paper delves into Explainable RL (XRL), a subfield of Explainable AI (XAI) aimed at unravelling the complexities of RL models. Our focus rests on state-explaining techniques, a crucial subset within XRL methods, as they reveal the underlying factors influencing an agent's actions at any given time. Despite their significant role, the lack of a unified evaluation framework hinders assessment of their accuracy and effectiveness. To address this, we introduce XRL-Bench, a unified, standardized benchmark tailored for the evaluation and comparison of XRL methods, encompassing three main modules: standard RL environments, explainers based on state importance, and standard evaluators. XRL-Bench supports both tabular and image data for state explanation. We also propose TabularSHAP, an innovative and competitive XRL method. We demonstrate the practical utility of TabularSHAP in real-world online gaming services and offer an open-source benchmark platform for the straightforward implementation and evaluation of XRL methods. Our contributions facilitate the continued progression of XRL technology.
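
To make the idea of a state-importance explainer paired with an evaluator more concrete, the following is a minimal sketch on tabular state features. It is not the XRL-Bench API and it is not TabularSHAP: the toy linear policy, the all-zeros baseline, and every function name here (`action_probs`, `perturbation_importance`, `fidelity_drop`) are assumptions made for illustration. The explainer is a simple perturbation method (replace one feature with a baseline value and measure the change in the chosen action's probability), and the evaluator is a fidelity-style check (mask the top-k features and measure how much the policy's confidence drops).

```python
import math


def policy_logits(state):
    """Toy linear policy over 4 tabular features and 2 actions (hypothetical)."""
    weights = [
        [2.0, 0.0, -0.5, 0.0],   # action 0
        [-2.0, 0.0, 0.5, 0.0],   # action 1
    ]
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def action_probs(state):
    """Softmax over the policy logits."""
    logits = policy_logits(state)
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def perturbation_importance(state, baseline):
    """Explainer sketch: the score of feature i is the drop in the chosen
    action's probability when feature i is replaced by its baseline value."""
    probs = action_probs(state)
    action = probs.index(max(probs))
    scores = []
    for i in range(len(state)):
        perturbed = list(state)
        perturbed[i] = baseline[i]
        scores.append(probs[action] - action_probs(perturbed)[action])
    return action, scores


def fidelity_drop(state, baseline, scores, k):
    """Evaluator sketch: mask the k features with the largest absolute score
    and return the resulting drop in the original action's probability."""
    probs = action_probs(state)
    action = probs.index(max(probs))
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    masked = list(state)
    for i in ranked[:k]:
        masked[i] = baseline[i]
    return probs[action] - action_probs(masked)[action]


# Example usage with an assumed state and an all-zeros baseline.
state = [1.0, 3.0, 0.5, -2.0]
baseline = [0.0, 0.0, 0.0, 0.0]
action, scores = perturbation_importance(state, baseline)
drop = fidelity_drop(state, baseline, scores, k=1)
print("chosen action:", action)
print("importance scores:", scores)
print("fidelity drop after masking top-1 feature:", drop)
```

In this toy setup, features 1 and 3 have zero weight in the policy, so a faithful explainer should give them zero importance, and masking the top-ranked feature should cause a large drop in the chosen action's probability. A benchmark in the spirit described above would run checks of this kind across many states, environments, and explainers.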

Code Implementations: 1 repo
