LG · AI · Jul 20, 2023

A Definition of Continual Reinforcement Learning

DeepMind · Stanford
arXiv:2307.11046v2 · 142 citations · h-index: 65
Originality: Incremental advance
AI Analysis

This foundational work addresses a conceptual gap for reinforcement learning researchers. It is rated as an incremental advance because it contributes a definition rather than new methods.

The paper tackles the lack of a clear definition for continual reinforcement learning. It formalizes the problem as the setting in which the best agents never stop learning, introduces a new mathematical language for analyzing and cataloging agents, and shows that traditional views of multi-task reinforcement learning and continual supervised learning are special cases of the definition.

In a standard view of the reinforcement learning problem, an agent's goal is to efficiently identify a policy that maximizes long-term reward. However, this perspective is based on a restricted view of learning as finding a solution, rather than treating learning as endless adaptation. In contrast, continual reinforcement learning refers to the setting in which the best agents never stop learning. Despite the importance of continual reinforcement learning, the community lacks a simple definition of the problem that highlights its commitments and makes its primary concepts precise and clear. To this end, this paper is dedicated to carefully defining the continual reinforcement learning problem. We formalize the notion of agents that "never stop learning" through a new mathematical language for analyzing and cataloging agents. Using this new language, we define a continual learning agent as one that can be understood as carrying out an implicit search process indefinitely, and continual reinforcement learning as the setting in which the best agents are all continual learning agents. We provide two motivating examples, illustrating that traditional views of multi-task reinforcement learning and continual supervised learning are special cases of our definition. Collectively, these definitions and perspectives formalize many intuitive concepts at the heart of learning, and open new research pathways surrounding continual learning agents.

