AI · Jun 19, 2020

Modelling Agent Policies with Interpretable Imitation Learning

arXiv:2006.11309v1 · 10 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for transparency in autonomous agents deployed in safety-critical applications. It appears incremental, however, since it builds on existing imitation learning and interpretability techniques.

The paper tackles the problem of understanding black box agent policies in safety-critical domains by developing an interpretable imitation learning method that yields decision tree models, with initial promising results demonstrated in a multi-agent traffic environment.

As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents' latent state representations by selecting from a large space of candidate features constructed from the Markov state. We present initial promising results from an implementation in a multi-agent traffic environment.
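The idea in the abstract can be illustrated with a minimal sketch: query an opaque policy, build candidate features from the Markov state, and fit a small decision tree that picks whichever feature best explains the observed actions. Everything here is an assumption made for illustration and is not taken from the paper: the toy state `(speed, gap, lead_speed)`, the stand-in `black_box_policy`, the hand-written candidate feature set, and the greedy misclassification-based tree learner.

```python
import random
from collections import Counter

def black_box_policy(state):
    # Hypothetical stand-in for the opaque agent: brake when the time gap is short.
    speed, gap, lead_speed = state
    return "brake" if gap / speed < 2.0 else "accelerate"

def candidate_features(state):
    # Assumed candidate set built from the Markov state: raw values, a difference, ratios.
    speed, gap, lead_speed = state
    return {
        "speed": speed,
        "gap": gap,
        "lead_speed": lead_speed,
        "speed - lead_speed": speed - lead_speed,
        "gap / speed": gap / speed,
        "gap / lead_speed": gap / lead_speed,
    }

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def errors(labels):
    # Misclassifications if this group is predicted as its majority label.
    return len(labels) - max(Counter(labels).values()) if labels else 0

def best_split(rows, labels):
    # Greedy search over (feature, threshold) pairs, minimising misclassifications.
    best = None
    for name in rows[0]:
        values = sorted(set(r[name] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[name] < t]
            right = [y for r, y in zip(rows, labels) if r[name] >= t]
            cost = errors(left) + errors(right)
            if best is None or cost < best[0]:
                best = (cost, name, t)
    return best

def fit_tree(rows, labels, max_depth):
    # A leaf is a plain action label; an internal node is (feature, threshold, left, right).
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    split = best_split(rows, labels)
    if split is None:
        return majority(labels)
    _, name, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[name] < t]
    right = [(r, y) for r, y in zip(rows, labels) if r[name] >= t]
    if not left or not right:
        return majority(labels)
    return (
        name,
        t,
        fit_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
        fit_tree([r for r, _ in right], [y for _, y in right], max_depth - 1),
    )

def predict(tree, row):
    while isinstance(tree, tuple):
        name, t, left, right = tree
        tree = left if row[name] < t else right
    return tree

# Collect demonstrations by querying the black box on randomly sampled states.
rng = random.Random(0)
states = [(rng.uniform(5, 30), rng.uniform(5, 100), rng.uniform(5, 30)) for _ in range(200)]
rows = [candidate_features(s) for s in states]
labels = [black_box_policy(s) for s in states]

tree = fit_tree(rows, labels, max_depth=2)
# Fidelity: fraction of demonstrations on which the tree agrees with the black box.
fidelity = sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)
print(tree, fidelity)
```

In this toy setting the tree only becomes compact because the candidate set happens to contain the feature the policy actually uses. That is the motivation for searching a large feature space rather than splitting on raw state variables alone.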

