LG, AI, ML | May 16, 2019

Meta Reinforcement Learning with Task Embedding and Shared Policy

arXiv:1905.06527v3 | 55 citations
Originality: Incremental advance
AI Analysis

This work targets two known bottlenecks in deep reinforcement learning: data inefficiency and limited generalization. The advance is incremental, since it builds on existing meta-RL approaches.

The paper tackles data inefficiency and limited generalization in deep reinforcement learning with a meta-RL method that shares one policy across all tasks and meta-learns how to quickly abstract task-specific information into a task embedding. On four simulated tasks, it attains up to 3 to 4 times higher returns than the baselines.

Despite significant progress, deep reinforcement learning (RL) suffers from data-inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task could be solved quickly. Though specific in some ways, different tasks in meta-RL are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the specific and shared information among different tasks, which limits their ability to learn training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and meta-learn how to quickly abstract the specific information about a task on the other hand. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy which is shared across all tasks and conditioned on task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns compared to baselines.
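The split described in the abstract, a per-task encoder adapted by a few SGD steps plus a policy shared across tasks and conditioned on the task embedding, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the linear mean-pooling encoder, the reward-weighted squared-error surrogate loss, the dimensions, and every function name below are assumptions chosen to keep the example self-contained, and the outer meta-update that would train the encoder initialization and the shared policy is omitted.

```python
import numpy as np

# All sizes and names here are illustrative assumptions, not taken from the paper.
STATE_DIM, ACTION_DIM, EMBED_DIM = 3, 2, 4
EXP_DIM = STATE_DIM + ACTION_DIM + 1  # one (state, action, reward) row per transition


def encode_task(enc_w, experience):
    """Task encoder: mean-pool past transitions, then apply a tanh-linear map."""
    return np.tanh(experience.mean(axis=0) @ enc_w)


def policy_mean(pol_w, states, embedding):
    """Shared policy: action means from states concatenated with the embedding."""
    tiled = np.tile(embedding, (states.shape[0], 1))
    return np.concatenate([states, tiled], axis=1) @ pol_w


def surrogate_loss(enc_w, pol_w, experience):
    """Reward-weighted squared error, an assumed stand-in for an RL objective."""
    states = experience[:, :STATE_DIM]
    actions = experience[:, STATE_DIM:STATE_DIM + ACTION_DIM]
    rewards = experience[:, -1]
    mu = policy_mean(pol_w, states, encode_task(enc_w, experience))
    return np.mean(rewards * 0.5 * np.sum((actions - mu) ** 2, axis=1))


def encoder_grad(enc_w, pol_w, experience):
    """Analytic gradient of surrogate_loss with respect to the encoder weights."""
    states = experience[:, :STATE_DIM]
    actions = experience[:, STATE_DIM:STATE_DIM + ACTION_DIM]
    rewards = experience[:, -1]
    pooled = experience.mean(axis=0)
    z = np.tanh(pooled @ enc_w)
    mu = policy_mean(pol_w, states, z)
    d_mu = -(rewards[:, None] * (actions - mu)) / len(rewards)
    # Only the embedding rows of the policy weights touch the encoder output.
    d_z = pol_w[STATE_DIM:] @ d_mu.sum(axis=0)
    return np.outer(pooled, d_z * (1.0 - z ** 2))


def adapt_encoder(enc_w_init, pol_w, experience, lr=0.05, steps=3):
    """Inner loop: a few SGD steps on the encoder only; the policy is untouched."""
    enc_w = enc_w_init.copy()
    for _ in range(steps):
        enc_w = enc_w - lr * encoder_grad(enc_w, pol_w, experience)
    return enc_w


rng = np.random.default_rng(0)
enc_w_init = 0.1 * rng.standard_normal((EXP_DIM, EMBED_DIM))
pol_w = 0.1 * rng.standard_normal((STATE_DIM + EMBED_DIM, ACTION_DIM))
experience = rng.standard_normal((16, EXP_DIM))  # synthetic transitions for one task

enc_w_task = adapt_encoder(enc_w_init, pol_w, experience)
embedding = encode_task(enc_w_task, experience)
action_mean = policy_mean(pol_w, experience[:1, :STATE_DIM], embedding)
```

In this sketch only `enc_w` changes per task, while `pol_w` is reused unchanged, which mirrors the separation between task-specific and shared information that the abstract describes.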

Code Implementations: 2 repos
