LG AI ML | Jun 14, 2018

Self-Imitation Learning

Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee

arXiv:1806.05635v133.8298 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses exploration challenges in reinforcement learning, but the advance is incremental because it builds on existing actor-critic methods.

The paper tackles exploration in reinforcement learning by proposing Self-Imitation Learning (SIL), which learns to reproduce the agent's past good decisions. It shows that SIL significantly improves A2C on several hard exploration Atari games and is competitive with state-of-the-art count-based exploration methods.

This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive with the state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks.
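The abstract does not spell out how "reproducing past good decisions" becomes a training signal. As a minimal sketch, assuming the clipped-advantage form of the objective described in the paper: a stored transition contributes only when its observed return R exceeds the current value estimate V(s). The function names, the plain-list inputs, and the `value_coef` default below are illustrative assumptions, not the authors' code.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo returns R_t = r_t + gamma * R_{t+1} for one finished episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def sil_loss(log_probs, values, returns, value_coef=0.01):
    """Self-imitation loss averaged over a batch of replayed transitions.

    log_probs: log pi(a|s) of the stored actions under the current policy
    values:    current value estimates V(s) for the stored states
    returns:   discounted returns observed when the transitions were collected
    value_coef is a hypothetical weighting of the value term.
    """
    policy_terms = []
    value_terms = []
    for logp, v, ret in zip(log_probs, values, returns):
        # Only transitions that turned out better than expected (R > V) count.
        clipped_adv = max(ret - v, 0.0)
        # In an autodiff framework the clipped advantage would be treated as a
        # constant in the policy term (an assumption of this sketch).
        policy_terms.append(-logp * clipped_adv)
        value_terms.append(0.5 * clipped_adv ** 2)
    n = len(policy_terms)
    policy_loss = sum(policy_terms) / n
    value_loss = sum(value_terms) / n
    return policy_loss + value_coef * value_loss, policy_loss, value_loss


if __name__ == "__main__":
    rets = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)
    total, pol, val = sil_loss(
        log_probs=[math.log(0.5)] * 3,
        values=[0.0, 1.0, 0.5],
        returns=rets,
    )
    print(rets, total, pol, val)
```

Transitions whose return falls below the value estimate contribute zero to both terms, so the update imitates only decisions that did better than the agent currently expects. In a full agent this loss would be applied to samples from a replay buffer alongside the usual on-policy A2C or PPO update.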

