LG, NE, ML | Aug 4, 2021

A Pragmatic Look at Deep Imitation Learning

arXiv:2108.01867v2 | 12 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a standardized evaluation for researchers in imitation learning, though it is incremental as it focuses on fair comparison rather than introducing new methods.

The authors tackled the problem of inconsistent comparisons in deep imitation learning by re-implementing and standardizing six algorithms on a common benchmark, finding that GAIL performs consistently well, AdRIL is competitive with tuning, and behavioral cloning remains effective with ample data.

The introduction of the generative adversarial imitation learning (GAIL) algorithm has spurred the development of scalable imitation learning approaches using deep neural networks. Many of the algorithms that followed used a similar procedure, combining on-policy actor-critic algorithms with inverse reinforcement learning. More recently there has been an even larger breadth of approaches, most of which use off-policy algorithms. However, with this breadth of algorithms, everything from datasets to base reinforcement learning algorithms to evaluation settings can vary, making it difficult to compare them fairly. In this work we re-implement 6 different IL algorithms, updating 3 of them to be off-policy, base them on a common off-policy algorithm (SAC), and evaluate them on a widely-used expert trajectory dataset (D4RL) for the most common benchmark (MuJoCo). After giving all algorithms the same hyperparameter optimisation budget, we compare their results for a range of expert trajectories. In summary, GAIL, with all of its improvements, consistently performs well across a range of sample sizes, AdRIL is a simple contender that performs well with one important hyperparameter to tune, and behavioural cloning remains a strong baseline when data is more plentiful.
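For readers unfamiliar with the behavioural cloning baseline mentioned above, the idea can be sketched as plain supervised learning on expert (state, action) pairs. The snippet below is a toy illustration only, not the paper's implementation: it assumes a one-dimensional state and action, a linear policy fitted by least squares, and a made-up expert, whereas the paper uses deep networks on D4RL MuJoCo trajectories. All function and variable names are hypothetical.

```python
# Toy behavioural cloning: fit a linear policy a = w * s + b to expert
# (state, action) pairs by ordinary least squares.
# Assumption: 1-D states and actions; all names and data are illustrative.

def fit_linear_policy(states, actions):
    """Return (w, b) minimising the squared error between w * s + b and the expert actions."""
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    var_s = sum((s - mean_s) ** 2 for s in states)
    w = cov_sa / var_s
    b = mean_a - w * mean_s
    return w, b


def make_policy(w, b):
    """Wrap the fitted parameters as a callable policy: state -> action."""
    return lambda state: w * state + b


# Hypothetical expert that always acts as a = -2 * s + 0.5.
expert_states = [-1.0, -0.5, 0.0, 0.5, 1.0]
expert_actions = [-2.0 * s + 0.5 for s in expert_states]

w, b = fit_linear_policy(expert_states, expert_actions)
policy = make_policy(w, b)
```

With more expert pairs the fit has more to work from, which is loosely the intuition behind behavioural cloning being a strong baseline when data is plentiful; the adversarial and off-policy methods compared in the paper additionally interact with the environment during training.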

Code Implementations: 1 repo
