AI, LG | Jun 15, 2020

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

arXiv:2006.08170v5 | 23.733 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key bottleneck in meta-RL for sparse-reward environments, offering incremental improvements in exploration efficiency.

The paper tackles efficient exploration in meta reinforcement learning for sparse-reward tasks. It introduces an empowerment-driven exploration objective that maximizes information gain for task identification, and the resulting method is reported to significantly outperform state-of-the-art baselines on sparse-reward MuJoCo and Meta-World tasks.

Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks. Despite recent progress, efficient exploration in meta-RL remains a key challenge in sparse-reward tasks, as it requires quickly finding informative task-relevant experiences in both meta-training and adaptation. To address this challenge, we explicitly model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning, and introduce a novel empowerment-driven exploration objective, which aims to maximize information gain for task identification. We derive a corresponding intrinsic reward and develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies by sharing the knowledge of task inference. Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on various sparse-reward MuJoCo locomotion tasks and more complex sparse-reward Meta-World tasks.
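The abstract does not state the form of the intrinsic reward, so the following is only a rough illustration of the general idea of rewarding information gain for task identification. It assumes a small discrete set of candidate tasks and a known observation likelihood per task, and scores a transition by how much it reduces the entropy of the belief over tasks. The function names (`entropy`, `bayes_update`, `info_gain_reward`) and the discrete-belief setup are hypothetical choices for this sketch, not the reward derived in the paper or the API of its code release.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def bayes_update(belief, likelihoods):
    """Posterior over tasks after one observation.

    belief[i]      : prior probability of task i (assumed setup)
    likelihoods[i] : probability of the observation under task i
    """
    unnorm = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnorm)
    if total == 0.0:
        # Observation impossible under every task: keep the prior.
        return list(belief)
    return [u / total for u in unnorm]


def info_gain_reward(belief, likelihoods):
    """Illustrative intrinsic reward: entropy reduction of the task belief."""
    posterior = bayes_update(belief, likelihoods)
    return entropy(belief) - entropy(posterior), posterior


# Four equally likely tasks.
prior = [0.25, 0.25, 0.25, 0.25]

# An observation that only task 0 can produce identifies the task.
decisive_reward, decisive_posterior = info_gain_reward(prior, [1.0, 0.0, 0.0, 0.0])

# An observation that is equally likely under all tasks carries no information.
flat_reward, flat_posterior = info_gain_reward(prior, [0.5, 0.5, 0.5, 0.5])
```

Under these assumptions, the decisive observation earns a reward of log 4 (about 1.386 nats), while the uninformative one earns 0, so an exploration policy trained on this signal would be pushed toward experiences that disambiguate the task.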
