LG AI Mar 11, 2021

Generalizable Episodic Memory for Deep Reinforcement Learning

Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang

arXiv:2103.06469v3 17.544 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses a sample-efficiency bottleneck in reinforcement learning, across both continuous-control and discrete-action settings, and offers incremental improvements.

The paper tackles the failure of episodic memory methods in continuous domains, where states are never revisited. It proposes Generalizable Episodic Memory (GEM), which organizes state-action values to support implicit planning and uses a double estimator to reduce overestimation bias. The reported results show significant improvements over baseline algorithms on MuJoCo continuous control tasks and Atari games.

Episodic memory-based methods can rapidly latch onto past successful strategies by a non-parametric memory and improve sample efficiency of traditional reinforcement learning. However, little effort is put into the continuous domain, where a state is never visited twice, and previous episodic methods fail to efficiently aggregate experience across trajectories. To address this problem, we propose Generalizable Episodic Memory (GEM), which effectively organizes the state-action values of episodic memory in a generalizable manner and supports implicit planning on memorized trajectories. GEM utilizes a double estimator to reduce the overestimation bias induced by value propagation in the planning process. Empirical evaluation shows that our method significantly outperforms existing trajectory-based methods on various MuJoCo continuous control tasks. To further show the general applicability, we evaluate our method on Atari games with discrete action space, which also shows a significant improvement over baseline algorithms.
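The abstract does not spell out the update rule, so the following is only an illustrative sketch of what value propagation along a memorized trajectory with a double estimator might look like. It assumes a backward recursion of the form R_t = r_t + gamma * max(R_{t+1}, Q(s_{t+1}, a_{t+1})), where the Q-values come from some learned estimator. The function names, argument layout, and the particular way the two estimators are split (one selects, the other evaluates) are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def propagate_returns(rewards: Sequence[float],
                      q_next: Sequence[float],
                      gamma: float = 0.99) -> List[float]:
    """Backward pass over one stored trajectory (hypothetical helper).

    rewards[t] is r_t; q_next[t] is a learned estimate of
    Q(s_{t+1}, a_{t+1}), assumed to be 0.0 after a terminal step.
    At every step the target either keeps following the stored
    trajectory or bootstraps from the estimate, whichever is larger.
    """
    targets = [0.0] * len(rewards)
    future = 0.0  # return beyond the end of the episode
    for t in reversed(range(len(rewards))):
        future = rewards[t] + gamma * max(future, q_next[t])
        targets[t] = future
    return targets


def propagate_returns_double(rewards: Sequence[float],
                             q_select: Sequence[float],
                             q_eval: Sequence[float],
                             gamma: float = 0.99) -> List[float]:
    """Same pass with two estimators (assumed split, for illustration).

    q_select decides whether to bootstrap or keep following the
    trajectory; q_eval supplies the value that enters the target.
    Decoupling selection from evaluation is the usual way a double
    estimator counteracts the upward bias of taking a max over
    noisy estimates.
    """
    targets = [0.0] * len(rewards)
    future_select = 0.0  # running return as seen by the selecting estimator
    future_eval = 0.0    # running return as seen by the evaluating estimator
    for t in reversed(range(len(rewards))):
        if q_select[t] > future_select:
            next_select, next_eval = q_select[t], q_eval[t]
        else:
            next_select, next_eval = future_select, future_eval
        future_select = rewards[t] + gamma * next_select
        future_eval = rewards[t] + gamma * next_eval
        targets[t] = future_eval
    return targets
```

With a single estimator, one overly optimistic Q-value is copied straight into every earlier target on the trajectory. In the double variant the optimistic estimator can still choose where to bootstrap, but the value that propagates backward comes from the second estimator.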
