LG Jun 22, 2022

Auto-Encoding Adversarial Imitation Learning

Kaifeng Zhang, Rui Zhao, Ziming Zhang, Yang Gao

arXiv:2206.11004v53.32 citations h-index: 67

Originality: Incremental advance

AI Analysis

This work addresses the challenge practitioners face in designing reward functions for RL, though it is an incremental improvement on existing adversarial imitation learning methods.

The paper tackles automatic policy acquisition in reinforcement learning without access to reward signals. It proposes Auto-Encoding Adversarial Imitation Learning (AEAIL), which uses the reconstruction error of an auto-encoder as the reward signal. The authors report better performance than state-of-the-art methods and greater robustness to noisy demonstrations.

Reinforcement learning (RL) provides a powerful framework for decision-making, but its application in practice often requires a carefully designed reward function. Adversarial Imitation Learning (AIL) sheds light on automatic policy acquisition without access to the reward signal from the environment. In this work, we propose Auto-Encoding Adversarial Imitation Learning (AEAIL), a robust and scalable AIL framework. To induce expert policies from demonstrations, AEAIL uses the reconstruction error of an auto-encoder as a reward signal, which provides more information for optimizing policies than the rewards of prior discriminator-based methods. We then use the derived objective functions to train the auto-encoder and the agent policy. Experiments show that AEAIL outperforms state-of-the-art methods on both state-based and image-based environments. More importantly, AEAIL is much more robust when the expert demonstrations are noisy.
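
The idea of turning reconstruction error into a reward can be illustrated with a minimal sketch. This is not the authors' implementation: the linear auto-encoder, its closed-form SVD fit on expert data only, and the mapping `1 / (1 + error)` are all assumptions chosen for illustration, and the adversarial training of the auto-encoder against agent samples is omitted. All function names are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(expert, latent_dim):
    """Fit a linear auto-encoder on expert samples in closed form via SVD.

    Assumption: this stands in for a trained neural auto-encoder; it only
    fits expert data and involves no adversarial objective.
    """
    _, _, vt = np.linalg.svd(expert, full_matrices=False)
    w_enc = vt[:latent_dim].T  # shape (input_dim, latent_dim)
    w_dec = vt[:latent_dim]    # shape (latent_dim, input_dim)
    return w_enc, w_dec


def reconstruction_error(x, w_enc, w_dec):
    """Per-sample mean squared error between x and its reconstruction."""
    x = np.atleast_2d(x)
    recon = (x @ w_enc) @ w_dec
    return np.mean((x - recon) ** 2, axis=1)


def reward_from_error(err):
    """Map error to a reward in (0, 1]; this mapping is an illustrative assumption."""
    return 1.0 / (1.0 + err)


# Toy expert samples lying on the line y = x in a 2-D input space.
expert = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [0.5, 0.5]])
w_enc, w_dec = fit_linear_autoencoder(expert, latent_dim=1)

# Expert-like samples reconstruct well and receive a high reward.
expert_reward = reward_from_error(reconstruction_error(expert, w_enc, w_dec))

# A sample far from the expert data reconstructs poorly and receives a lower reward.
agent_sample = np.array([1.0, -1.0])
agent_reward = reward_from_error(reconstruction_error(agent_sample, w_enc, w_dec))

print(expert_reward, agent_reward)
```

Unlike a binary discriminator output, the reconstruction error is computed over every input dimension, which is one way to read the abstract's claim that the reward carries more information for policy optimization.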
