AI · HC · RO · Jan 31, 2025

Objective Metrics for Human-Subjects Evaluation in Explainable Reinforcement Learning

arXiv:2501.19256v1 · 4 citations · h-index: 4
Originality: Synthesis-oriented
AI Analysis

This work addresses unreliable evaluation methods in XRL, a problem for both researchers and practitioners. It is incremental: it builds on existing critiques by proposing specific methodologies.

The paper tackles the lack of objective human evaluation in explainable reinforcement learning (XRL). It advocates for and curates objective metrics based on observable behavior, applied to tasks such as debugging agent behavior and human-agent teaming, to improve reproducibility and comparability in research.

Explanation is a fundamentally human process. Understanding the goal and audience of the explanation is vital, yet existing work on explainable reinforcement learning (XRL) routinely does not consult humans in its evaluations. Even when it does, it routinely resorts to subjective metrics, such as confidence or understanding, that can only inform researchers of users' opinions, not of the explanations' practical effectiveness for a given problem. This paper calls on researchers to use objective human metrics for explanation evaluations based on observable and actionable behaviour to build more reproducible, comparable, and epistemically grounded research. To this end, we curate, describe, and compare several objective evaluation methodologies for applying explanations to debugging agent behaviour and supporting human-agent teaming, illustrating our proposed methods using a novel grid-based environment. We discuss how subjective and objective metrics complement each other to provide holistic validation and how future work needs to utilise standardised benchmarks for testing to enable greater comparisons between research.

