LG · AI · ML · Jun 22, 2018

Many-Goals Reinforcement Learning

arXiv:1806.09605v1 · 56 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of scaling multi-goal learning to deep RL. It is incremental: it builds on prior tabular methods rather than introducing a fundamentally new approach.

The paper extends all-goals updating from tabular to deep reinforcement learning through three extensions: achieving mastery in visual-observation domains, pre-training a network so that it later learns a main task faster, and providing auxiliary-task updates that improve main-task performance. Each extension is compared against baselines.

All-goals updating exploits the off-policy nature of Q-learning to update all possible goals an agent could have from each transition in the world, and was introduced into Reinforcement Learning (RL) by Kaelbling (1993). In prior work this was mostly explored in small-state RL problems that allowed tabular representations and where all possible goals could be explicitly enumerated and learned separately. In this paper we empirically explore 3 different extensions of the idea of updating many (instead of all) goals in the context of RL with deep neural networks (or DeepRL for short). First, in a direct adaptation of Kaelbling's approach we explore if many-goals updating can be used to achieve mastery in non-tabular visual-observation domains. Second, we explore whether many-goals updating can be used to pre-train a network to subsequently learn faster and better on a single main task of interest. Third, we explore whether many-goals updating can be used to provide auxiliary task updates in training a network to learn faster and better on a single main task of interest. We provide comparisons to baselines for each of the 3 extensions.
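To make the core idea concrete, here is a minimal tabular sketch of all-goals updating, in the spirit of the small-state setting the abstract attributes to prior work. It is an illustration under stated assumptions, not the paper's implementation: the five-state chain environment, the reward scheme (1 on reaching the goal, which is treated as terminal for that goal, 0 otherwise), and the names `all_goals_update` and `train_chain` are all hypothetical. The point it shows is that one transition, collected by a purely random behaviour policy, updates the Q-table of every goal at once.

```python
import random


def all_goals_update(Q, s, a, s_next, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update per goal from a single transition.

    Q[g][s][a] is the value of action a in state s when pursuing goal g.
    Assumed reward: 1 if the transition lands on g (terminal for g), else 0.
    """
    for g in range(len(Q)):
        if s_next == g:
            # Goal reached: reward 1 and no bootstrapping past the goal.
            target = 1.0
        else:
            # Reward 0; bootstrap from the best next action for this goal.
            target = gamma * max(Q[g][s_next])
        Q[g][s][a] += alpha * (target - Q[g][s][a])


def train_chain(n_states=5, steps=5000, seed=0):
    """Random walk on a chain in which every state doubles as a goal."""
    rng = random.Random(seed)
    n_actions = 2  # 0 = step left, 1 = step right
    Q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_states)]
    s = rng.randrange(n_states)
    for _ in range(steps):
        a = rng.randrange(n_actions)  # behaviour policy ignores all goals
        step = 1 if a == 1 else -1
        s_next = max(0, min(n_states - 1, s + step))  # walls at both ends
        all_goals_update(Q, s, a, s_next)
        s = s_next
    return Q


Q = train_chain()
# Greedy action per state when the goal is the right-most state (index 4).
greedy_to_last = [max(range(2), key=lambda a: Q[4][s][a]) for s in range(4)]
# Greedy action per state when the goal is the left-most state (index 0).
greedy_to_first = [max(range(2), key=lambda a: Q[0][s][a]) for s in range(1, 5)]
```

Because Q-learning is off-policy, the random walk never has to pursue any particular goal, yet the greedy policy extracted for each goal still heads toward it. In the deep setting the abstract describes, the per-goal tables would be replaced by a network and the loop would run over many goals rather than all of them.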

