LG AI ML
Feb 11, 2020

HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem

Yun Hua, Xiangfeng Wang, Bo Jin, Wenhao Li, Junchi Yan, Xiaofeng He, Hongyuan Zha

arXiv:2002.04238v23.310 citations

Originality: Incremental advance

AI Analysis

The paper addresses inefficient learning in sparse-reward RL, but the advance is incremental because it builds on existing meta-RL methods.

The paper tackles the difficulty of meta reinforcement learning in sparse reward settings by introducing HMRL, a framework that constructs a common meta state space and uses meta reward shaping, resulting in improved transferability and policy learning efficiency in experiments.

Despite the success of existing meta reinforcement learning methods, they still have difficulty learning a meta policy effectively for RL problems with sparse reward. To address this, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse-reward RL problems. It consists of three modules, including a cross-environment meta state embedding module, which constructs a common meta state space to adapt to different environments, and a meta-state-based, environment-specific meta reward shaping module, which effectively extends the original sparse reward trajectory through cross-environmental knowledge complementarity. As a consequence, the meta policy achieves better generalization and efficiency with the shaped meta reward. Experiments with sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
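To make the idea of shaping a sparse reward over a shared embedded state space more concrete, here is a minimal sketch. It is not HMRL's method: it uses classic potential-based reward shaping as a stand-in, and the linear `embed` function, the `goal` vector, and the distance-based `potential` are all hypothetical choices made for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def embed(state: Vector, weights: Sequence[Vector]) -> List[float]:
    """Map an environment-specific state into a shared space.

    Hypothetical stand-in: a fixed linear map. A real system would
    learn this embedding; nothing here comes from the paper.
    """
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def potential(z: Vector, goal: Vector) -> float:
    """Assumed potential: negative Euclidean distance to a goal embedding."""
    return -(sum((a - b) ** 2 for a, b in zip(z, goal)) ** 0.5)


def shaped_reward(
    reward: float,
    state: Vector,
    next_state: Vector,
    weights: Sequence[Vector],
    goal: Vector,
    gamma: float = 0.99,
) -> float:
    """Potential-based shaping: r + gamma * phi(s') - phi(s).

    The shaping term gives a dense signal on every step, even when the
    environment's own reward is zero almost everywhere.
    """
    phi = potential(embed(state, weights), goal)
    phi_next = potential(embed(next_state, weights), goal)
    return reward + gamma * phi_next - phi


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # illustrative embedding weights
    goal = [1.0, 0.0]                    # illustrative goal in the shared space
    # A step toward the goal earns a positive shaped reward although
    # the environment reward is 0.
    print(shaped_reward(0.0, [0.0, 0.0], [1.0, 0.0], identity, goal, gamma=1.0))
```

In this sketch a step that moves the embedded state closer to the goal receives a positive shaped reward, and a step away receives a negative one, which is the kind of dense feedback that sparse-reward settings lack.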
