LG AI ML · Jun 13, 2020

Reinforcement Learning as Iterative and Amortised Inference

Beren Millidge, Alexander Tschantz, Anil K Seth, Christopher L Buckley

arXiv:2006.10524v35.03 citations

Originality Synthesis-oriented

AI Analysis

This work gives reinforcement learning researchers a fresh perspective for understanding and developing algorithms, though its contribution is incremental because it builds on existing frameworks.

The paper addresses the problem of categorising reinforcement learning algorithms. Using the control as inference framework, it proposes a novel classification scheme based on amortised and iterative inference, and shows that this perspective can unify existing algorithms and suggest new research directions.

There are several ways to categorise reinforcement learning (RL) algorithms: as model-based or model-free, policy-based or planning-based, on-policy or off-policy, and online or offline. Broad classification schemes such as these help provide a unified perspective on disparate techniques and can contextualise and guide the development of new algorithms. In this paper, we use the control as inference framework to outline a novel classification scheme based on amortised and iterative inference. We demonstrate that a wide range of algorithms can be classified in this manner, providing a fresh perspective and highlighting a range of existing similarities. Moreover, we show that taking this perspective allows us to identify parts of the algorithmic design space which have been relatively unexplored, suggesting new routes to innovative RL algorithms.
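To make the distinction between the two inference styles concrete, here is a minimal Python sketch. It is not taken from the paper: the toy reward, the function names, and the linear "policy" are all assumptions chosen for illustration. Iterative inference optimises the mean of an action distribution from scratch for each state it encounters (as a planner might), while amortised inference learns the parameters of a function that maps any state to that mean (as a policy might).

```python
def reward_grad_action(state, action):
    # Hypothetical toy reward r(s, a) = -(a - 2s)^2, whose best action is 2 * s.
    # This returns dr/da. For a fixed-variance Gaussian over actions, the
    # gradient of expected reward w.r.t. the mean has this same form.
    return -2.0 * (action - 2.0 * state)


def iterative_inference(state, steps=100, lr=0.1):
    """Optimise the action mean directly, starting afresh for this one state."""
    mean = 0.0
    for _ in range(steps):
        mean += lr * reward_grad_action(state, mean)
    return mean


def amortised_inference(train_states, epochs=200, lr=0.05):
    """Learn one weight w so that mean = w * state serves every state."""
    w = 0.0
    for _ in range(epochs):
        for s in train_states:
            # Chain rule: dr/dw = dr/dmean * dmean/dw, with dmean/dw = s.
            w += lr * reward_grad_action(s, w * s) * s
    return w


if __name__ == "__main__":
    # Iterative: pay the optimisation cost each time a state is encountered.
    print(iterative_inference(1.5))

    # Amortised: pay the cost once in training, then inference is one multiply.
    w = amortised_inference([-1.0, -0.5, 0.5, 1.0])
    print(w * 1.5)
```

In this sketch both routes arrive at approximately the same action mean for a given state. The trade-off is where the computation is spent: per decision for the iterative version, up front for the amortised one.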
