LG, AI, HC, NE, RO · Nov 17, 2020

Explaining Conditions for Reinforcement Learning Behaviors from Real and Imagined Data

arXiv:2011.09004v1 · 4 citations
AI Analysis

This work addresses the problem of calibrating user trust and expectations for real-world RL deployments by making RL system competencies more transparent to humans.

This paper introduces a method to generate human-interpretable abstract behavior models for reinforcement learning (RL) systems. These models identify the experiential conditions that lead to different task execution strategies and outcomes, using both real and imagined trajectory data.

The deployment of reinforcement learning (RL) in the real world comes with challenges in calibrating user trust and expectations. As a step toward developing RL systems that are able to communicate their competencies, we present a method of generating human-interpretable abstract behavior models that identify the experiential conditions leading to different task execution strategies and outcomes. Our approach consists of extracting experiential features from state representations, abstracting strategy descriptors from trajectories, and training an interpretable decision tree that identifies the conditions most predictive of different RL behaviors. We demonstrate our method on trajectory data generated from interactions with the environment and on imagined trajectory data that comes from a trained probabilistic world model in a model-based RL setting.
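The three steps named in the abstract (experiential features, strategy descriptors, an interpretable decision tree) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names `start_dist_to_goal` and `obstacle_on_path`, the strategy labels, the toy data, and the small Gini-based tree learner written in plain Python. The rows could equally come from real environment rollouts or from trajectories imagined by a world model.

```python
from collections import Counter

# Hypothetical experiential features, one row per trajectory:
# [initial distance to goal, obstacle on the direct path (0/1)].
FEATURE_NAMES = ["start_dist_to_goal", "obstacle_on_path"]

# Toy data invented for illustration (not from the paper).
X = [
    [1.0, 0], [2.0, 0], [1.5, 1], [2.5, 1],
    [6.0, 0], [7.0, 0], [6.5, 1], [8.0, 1],
]
# Hypothetical strategy/outcome descriptor abstracted from each trajectory.
y = ["direct", "direct", "detour", "detour",
     "direct", "direct", "timeout", "timeout"]


def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Grow a small tree; leaves are majority labels, nodes are dicts."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [lab for _, lab in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [lab for _, lab in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the tree from the root to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def rules(tree, path=()):
    """Flatten the tree into (condition string, behavior label) pairs."""
    if not isinstance(tree, dict):
        return [(" and ".join(path) or "always", tree)]
    name, t = FEATURE_NAMES[tree["feature"]], tree["threshold"]
    return (rules(tree["left"], path + (f"{name} <= {t}",))
            + rules(tree["right"], path + (f"{name} > {t}",)))


tree = build_tree(X, y)
for condition, behavior in rules(tree):
    print(f"if {condition}: {behavior}")
```

Each printed rule pairs a condition on the experiential features with the behavior it predicts, which is the kind of human-readable summary the abstract describes. A shallow depth limit keeps the rule set short enough to read at a glance, at the cost of some predictive accuracy on richer data.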

