Successor Feature Neural Episodic Control
This work addresses the challenge of building fast-learning and flexible agents in reinforcement learning, though it appears incremental as it combines existing techniques.
The paper aims to improve sample efficiency and transfer learning in reinforcement learning by integrating episodic control with successor features and generalized policy improvement. The combined framework shows benefits in empirical evaluations.
A longstanding goal in reinforcement learning is to build intelligent agents that learn quickly and transfer skills flexibly, as humans and animals do. This paper investigates the integration of two frameworks for tackling those goals: episodic control and successor features. Episodic control is a cognitively inspired approach that relies on episodic memory, an instance-based memory model of an agent's experiences. Meanwhile, successor features and generalized policy improvement (SF&GPI) is a meta- and transfer-learning framework that learns task policies which can be efficiently reused on later tasks with a different reward function. Individually, these two techniques have shown impressive results in vastly improving sample efficiency and in elegantly reusing previously learned policies. Thus, we outline a combination of both approaches in a single reinforcement learning framework and empirically illustrate its benefits.
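
To make the reuse of policies across reward functions concrete, the following is a minimal sketch of generalized policy improvement over successor features. It assumes the standard SF&GPI setting in which rewards are linear in some features, so that each stored policy i has action values Q_i(s, a) = psi_i(s, a) · w for a task weight vector w. The function name `gpi_action`, the argument layout `psi_per_policy[i][a]`, and the toy numbers are hypothetical and chosen only for illustration; this is a tabular toy for a single state, not the neural, memory-based implementation described in the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(u, v))


def gpi_action(psi_per_policy, w):
    """Pick an action by generalized policy improvement (GPI).

    psi_per_policy[i][a] is the (assumed given) successor feature vector of
    stored policy i for action a in the current state; w is the reward weight
    vector of the new task. Returns (action, value) where
    value = max_a max_i psi_i(s, a) . w.
    """
    n_actions = len(psi_per_policy[0])
    best_action, best_value = 0, float("-inf")
    for a in range(n_actions):
        # Evaluate every stored policy on the new task and keep the best one.
        q = max(dot(psi[a], w) for psi in psi_per_policy)
        if q > best_value:
            best_action, best_value = a, q
    return best_action, best_value


# Hypothetical successor features: 2 stored policies, 2 actions, 2 features.
psi = [
    [[1.0, 0.0], [0.0, 0.5]],  # policy 0
    [[0.2, 0.2], [0.0, 2.0]],  # policy 1
]

print(gpi_action(psi, [1.0, -1.0]))  # -> (0, 1.0)
print(gpi_action(psi, [0.0, 1.0]))   # -> (1, 2.0)
```

Changing only `w` changes the selected action without relearning any successor features, which is the sense in which previously learned policies are reused for a task with a different reward function.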