LG | AI | ML | Jan 5, 2020

Universal Successor Features for Transfer Reinforcement Learning

arXiv:2001.04025v1 | 29 citations
AI Analysis

This paper addresses knowledge transfer across tasks in reinforcement learning and offers a method to improve training efficiency, though the contribution is incremental: it builds on existing successor features and universal value functions.

The paper tackles the problem of transfer in reinforcement learning by proposing Universal Successor Features (USFs) to generalize over goals and states, showing in experiments with gridworld and MuJoCo environments that USFs greatly accelerate training and effectively transfer knowledge to new tasks.

Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015), which generalizes over goals and states, has previously been shown to be useful for transfer. However, successor features are believed to be more suitable than values for transfer (Dayan, 1993; Barreto et al., 2017), even though they cannot directly generalize to new goals. In this paper, we propose (1) Universal Successor Features (USFs) to capture the underlying dynamics of the environment while allowing generalization to unseen goals and (2) a flexible end-to-end model of USFs that can be trained by interacting with the environment. We show that learning USFs is compatible with any RL algorithm that learns state values using a temporal difference method. Our experiments in a simple gridworld and with two MuJoCo environments show that USFs can greatly accelerate training when learning multiple tasks and can effectively transfer knowledge to new tasks.
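To make the abstract's central idea concrete, here is a minimal tabular sketch of goal-conditioned successor features learned with a temporal difference update. It is not the authors' end-to-end model. Everything in it is an assumption chosen for illustration: the five-state chain, the one-hot features `phi`, the reward weights `goal_weights`, the fixed "walk toward the goal" policy in `step`, and the names `learn_usf` and `value`. Because it tabulates every goal, it shows only the factorization of a value into successor features times goal-specific reward weights, not generalization to unseen goals, which in the paper would come from a learned function approximator.

```python
import numpy as np

N_STATES = 5   # hypothetical 1-D chain of states 0..4
GAMMA = 0.9    # discount factor
ALPHA = 0.5    # TD step size


def phi(s):
    """One-hot state features (an assumption made for this sketch)."""
    v = np.zeros(N_STATES)
    v[s] = 1.0
    return v


def goal_weights(g):
    """Reward weights w(g): reward 1 for arriving at goal g, else 0."""
    return phi(g)


def step(s, g):
    """Fixed goal-conditioned policy: move one cell toward the goal."""
    return s + int(np.sign(g - s))


def learn_usf(n_sweeps=200):
    """Evaluate psi[s, g] by TD(0) under the fixed policy above."""
    # psi[s, g] is the successor-feature vector of state s for goal g.
    psi = np.zeros((N_STATES, N_STATES, N_STATES))
    for _ in range(n_sweeps):
        for g in range(N_STATES):
            for s in range(N_STATES):
                if s == g:
                    continue  # the goal is terminal, so psi stays zero there
                s_next = step(s, g)
                # TD target: features of the next state plus the discounted
                # successor features bootstrapped from that state.
                target = phi(s_next) + GAMMA * psi[s_next, g]
                psi[s, g] += ALPHA * (target - psi[s, g])
    return psi


def value(psi, s, g):
    """State value recovered as a dot product: V(s, g) = psi(s, g) . w(g)."""
    return float(psi[s, g] @ goal_weights(g))


psi = learn_usf()
print(value(psi, 0, 4))  # approaches GAMMA ** 3 for a goal four steps away
```

The update has the same shape as a TD(0) value update with the scalar reward replaced by the feature vector, which is one way to read the abstract's claim about compatibility with TD-based value learning. Changing the reward weights while keeping `psi` fixed re-scores the same dynamics under a different reward without relearning them.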

