Imitation with Neural Density Models
This work addresses efficient imitation learning for robotics and control systems, offering incremental improvements in demonstration efficiency.
The paper tackles imitation learning by proposing a framework that uses density estimation of the expert's occupancy measure and maximum occupancy entropy reinforcement learning, achieving state-of-the-art demonstration efficiency on benchmark control tasks.
We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the density as a reward. Our approach maximizes a non-adversarial model-free RL objective that provably lower bounds the reverse Kullback-Leibler divergence between the occupancy measures of the expert and the imitator. We present a practical IL algorithm, Neural Density Imitation (NDI), which obtains state-of-the-art demonstration efficiency on benchmark control tasks.
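The first phase described above, estimating the expert's occupancy density and using it as a reward, can be illustrated with a minimal sketch. Writing rho_pi and rho_E for the imitator and expert occupancy measures, the negative reverse KL divergence decomposes as -KL(rho_pi || rho_E) = E_{rho_pi}[log rho_E] + H(rho_pi), so a log-density reward accounts for the first term and an occupancy entropy bonus for the second. The code below covers only the density-as-reward part. It assumes a Gaussian kernel density estimator as a simple stand-in for the neural density model; the function name `fit_log_density`, the bandwidth, and the synthetic expert data are hypothetical and not taken from the paper.

```python
import numpy as np


def fit_log_density(expert_sa, bandwidth=0.5):
    """Fit a Gaussian KDE to expert (state, action) samples.

    Stand-in for a learned neural density model (an assumption of this
    sketch). Returns a function mapping a batch of (state, action) vectors
    to estimated log densities, usable as an RL reward.
    """
    expert_sa = np.asarray(expert_sa, dtype=float)
    n, d = expert_sa.shape
    # Log of the Gaussian kernel normalizer, averaged over n samples.
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth**2) - np.log(n)

    def log_density(x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        # Squared distances between queries and expert samples: shape (m, n).
        sq_dist = ((x[:, None, :] - expert_sa[None, :, :]) ** 2).sum(axis=-1)
        exponents = -0.5 * sq_dist / bandwidth**2
        # Numerically stable log-sum-exp over the expert samples.
        max_exp = exponents.max(axis=1, keepdims=True)
        log_sum = max_exp[:, 0] + np.log(np.exp(exponents - max_exp).sum(axis=1))
        return log_norm + log_sum

    return log_density


# Synthetic "expert" data: 200 concatenated (state, action) vectors in 3-D.
rng = np.random.default_rng(0)
expert_sa = rng.normal(loc=1.0, scale=0.3, size=(200, 3))

reward_fn = fit_log_density(expert_sa)
near_reward = reward_fn(np.array([[1.0, 1.0, 1.0]]))[0]  # close to expert data
far_reward = reward_fn(np.array([[5.0, 5.0, 5.0]]))[0]   # far from expert data
```

In this sketch, state-action pairs near the expert's data receive a higher reward than pairs far from it. An RL algorithm would then maximize this reward together with an occupancy entropy term, which is omitted here.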