$f$-GAIL: Learning $f$-Divergence for Generative Adversarial Imitation Learning
This work aims to improve the efficiency and performance of imitation learning for robotics and control applications. The contribution is incremental in that it builds on the existing GAIL framework.
The paper tackles the problem of selecting an appropriate divergence measure for imitation learning by proposing $f$-GAIL, which automatically learns a discrepancy measure from the $f$-divergence family together with a policy. Compared with IL baselines that use predefined divergences, it learns better policies with higher data efficiency across six physics-based control tasks.
Imitation learning (IL) aims to learn a policy from expert demonstrations that minimizes the discrepancy between the learner and expert behaviors. Various imitation learning algorithms have been proposed with different pre-determined divergences to quantify the discrepancy. This naturally gives rise to the following question: Given a set of expert demonstrations, which divergence can recover the expert policy more accurately with higher data efficiency? In this work, we propose $f$-GAIL, a new generative adversarial imitation learning (GAIL) model, that automatically learns a discrepancy measure from the $f$-divergence family as well as a policy capable of producing expert-like behaviors. Compared with IL baselines with various predefined divergence measures, $f$-GAIL learns better policies with higher data efficiency in six physics-based control tasks.
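To make the notion of an $f$-divergence family concrete, recall the standard definition for discrete distributions $p$ and $q$ with strictly positive entries: $D_f(p \,\|\, q) = \sum_x q(x)\, f\big(p(x)/q(x)\big)$, where $f$ is convex with $f(1) = 0$. The sketch below is only an illustration of that definition with a few fixed, hand-picked generator functions; it is not the learning procedure of $f$-GAIL, which instead learns the divergence from the expert demonstrations. The names `f_divergence`, `GENERATORS`, `expert`, and `learner`, as well as the toy distributions, are assumptions made for this example.

```python
import math

def f_divergence(p, q, f):
    """Discrete f-divergence D_f(p || q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes p and q are sequences of equal length with strictly
    positive entries that each sum to 1.
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

# A few well-known members of the f-divergence family, each given by a
# convex generator f with f(1) = 0 (hypothetical dictionary for this sketch).
GENERATORS = {
    "kl": lambda u: u * math.log(u),                  # KL(p || q)
    "reverse_kl": lambda u: -math.log(u),             # KL(q || p)
    "total_variation": lambda u: 0.5 * abs(u - 1.0),  # 0.5 * sum |p - q|
}

# Toy "expert" and "learner" distributions over three discrete outcomes.
expert = [0.5, 0.3, 0.2]
learner = [0.4, 0.4, 0.2]

for name, f in GENERATORS.items():
    print(name, f_divergence(expert, learner, f))
```

Each generator ranks the same pair of distributions differently, which is why the choice of divergence can affect which policy an imitation learner converges to and how many demonstrations it needs.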