AI | Oct 19, 2025

A Comparative User Evaluation of XRL Explanations using Goal Identification

arXiv:2510.16956v1 | h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for better comparative evaluations of XRL methods for debugging, though its contribution is incremental because it focuses on a single methodology and environment.

The study evaluates explainable reinforcement learning (XRL) algorithms for debugging by testing whether users can identify an agent's goal from an explanation. Only one algorithm achieved greater-than-random accuracy, users were overconfident, and self-reported ease of identification did not correlate with accuracy.

Debugging is a core application of explainable reinforcement learning (XRL) algorithms; however, limited comparative evaluations have been conducted to understand their relative performance. We propose a novel evaluation methodology to test whether users can identify an agent's goal from an explanation of its decision-making. Utilising Atari's Ms. Pacman environment and four XRL algorithms, we find that only one achieved greater than random accuracy for the tested goals and that users were generally overconfident in their selections. Further, we find that users' self-reported ease of identification and understanding for every explanation did not correlate with their accuracy.
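
The abstract reports three quantities: accuracy relative to random guessing, overconfidence, and the correlation between self-reported ease and accuracy. The sketch below shows one plausible way to compute them from per-response records. It is not the paper's analysis code: the response tuples, the number of candidate goals (`n_goals`), the exact binomial test, and the use of Pearson correlation are all illustrative assumptions, and the data is invented toy data.

```python
from math import comb
from statistics import mean


def binomial_p_value(correct: int, trials: int, chance: float) -> float:
    """One-sided exact binomial test: P(X >= correct) if users guessed at random."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy responses for one explanation method:
# (goal identified correctly? 0/1, stated confidence in [0, 1], self-reported ease 1-5)
responses = [
    (1, 0.9, 4), (0, 0.8, 5), (0, 0.7, 3), (1, 0.6, 2),
    (0, 0.9, 4), (0, 0.8, 4), (1, 0.7, 5), (0, 0.9, 3),
]
n_goals = 4            # assumed number of candidate goals shown to each user
chance = 1 / n_goals   # accuracy expected from random guessing

correct = [r[0] for r in responses]
accuracy = mean(correct)

# Is the observed accuracy distinguishable from random guessing?
p_value = binomial_p_value(sum(correct), len(correct), chance)

# Overconfidence: mean stated confidence minus actual accuracy (positive = overconfident).
overconfidence = mean(r[1] for r in responses) - accuracy

# Does self-reported ease track whether the answer was actually correct?
ease_accuracy_r = pearson([r[2] for r in responses], correct)

print(f"accuracy={accuracy:.3f} (chance={chance:.3f}), p={p_value:.3f}")
print(f"overconfidence={overconfidence:.3f}, ease/accuracy r={ease_accuracy_r:.3f}")
```

A positive `overconfidence` together with an `ease_accuracy_r` near zero would match the pattern the abstract describes, in which users feel that an explanation is easy to interpret without actually identifying the goal more often.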

