LG AI Feb 28, 2020

Efficiently Guiding Imitation Learning Agents with Human Gaze

Akanksha Saran, Ruohan Zhang, Elaine Schaertl Short, Scott Niekum

arXiv:2002.12500v4 11.519 citations

Originality: Incremental advance

AI Analysis

This work aims to make imitation learning agents more sample-efficient and better performing, evaluated on Atari video games, by using human gaze data during training without adding learnable parameters or requiring gaze data at test time.

The paper incorporates human gaze cues through an auxiliary loss, reporting performance improvements of 95% for BC, 343% for BCO, and 390% for T-REX, averaged over 20 Atari games, and outperforming AGIL, a prior state-of-the-art gaze-assisted imitation learning method.

Human gaze is known to be an intention-revealing signal in human demonstrations of tasks. In this work, we use gaze cues from human demonstrators to enhance the performance of agents trained via three popular imitation learning methods: behavioral cloning (BC), behavioral cloning from observation (BCO), and Trajectory-ranked Reward EXtrapolation (T-REX). Based on similarities between the attention of reinforcement learning agents and human gaze, we propose a novel approach for utilizing gaze data in a computationally efficient manner, as part of an auxiliary loss function, which guides a network to have higher activations in image regions where the human's gaze fixated. This work is a step towards augmenting any existing convolutional imitation learning agent's training with auxiliary gaze data. Our auxiliary coverage-based gaze loss (CGL) guides learning toward a better reward function or policy, without adding any learnable parameters and without requiring gaze data at test time. We find that our proposed approach improves performance by 95% for BC, 343% for BCO, and 390% for T-REX, averaged over 20 different Atari games. We also find that, compared to a prior state-of-the-art imitation learning method assisted by human gaze (AGIL), our method achieves better performance and is more efficient in terms of learning with fewer demonstrations. We further interpret trained CGL agents with a saliency map visualization method to explain their performance. Finally, we show that CGL can help alleviate a well-known causal confusion problem in imitation learning.
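The abstract describes the auxiliary loss only in words, so the following is a rough sketch of how a coverage-style gaze loss could be computed, not the authors' formulation. It assumes the network's convolutional activations have already been collapsed and resized into a 2D map matching the gaze heatmap, and it uses a KL-style penalty applied only where gaze mass exceeds activation mass. The names `normalize_map`, `coverage_gaze_loss`, `total_loss`, and the `weight` parameter are all hypothetical.

```python
import numpy as np


def normalize_map(m, eps=1e-8):
    """Clip a 2D map to non-negative values and scale it to sum to ~1."""
    m = np.clip(m, 0.0, None)
    return m / (m.sum() + eps)


def coverage_gaze_loss(activation, gaze, eps=1e-8):
    """Assumed coverage-style loss (illustrative, not the paper's formula).

    Only pixels where the normalized gaze map exceeds the normalized
    activation map contribute, so the network is pushed to cover the
    gazed regions but is not penalized for attending elsewhere.
    """
    f = normalize_map(activation, eps)
    g = normalize_map(gaze, eps)
    under = g > f  # pixels the network under-covers relative to gaze
    ratio = (g[under] + eps) / (f[under] + eps)
    return float(np.sum(g[under] * np.log(ratio)))


def total_loss(imitation_loss, activation, gaze, weight=1.0):
    """Add the gaze term to an existing imitation loss (e.g. a BC loss)."""
    return imitation_loss + weight * coverage_gaze_loss(activation, gaze)


# Toy 7x7 example: gaze fixated on the center pixel.
gaze = np.zeros((7, 7))
gaze[3, 3] = 1.0
uniform_activation = np.ones((7, 7))

print(coverage_gaze_loss(gaze, gaze))                # activation covers gaze
print(coverage_gaze_loss(uniform_activation, gaze))  # activation spread thin
```

Because the gaze term only enters the training objective, a design like this adds no learnable parameters and needs no gaze input at inference, which matches the properties the abstract claims for CGL.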

