LG · AI · ML · Oct 26, 2020

Forethought and Hindsight in Credit Assignment

arXiv:2010.13685v1 · 29 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses credit assignment, a fundamental challenge in reinforcement learning for AI agents, but it appears incremental: it builds on existing planning methods and does not claim broad state-of-the-art results.

The paper tackles credit assignment in reinforcement learning by comparing planning as forethought (with forward models) and planning as hindsight (with backward models), establishing their relative merits and limitations in constructed scenarios.

We address the problem of credit assignment in reinforcement learning and explore fundamental questions regarding the way in which an agent can best use additional computation to propagate new information, by planning with internal models of the world to improve its predictions. Particularly, we work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models. We establish the relative merits, limitations and complementary properties of both planning mechanisms in carefully constructed scenarios. Further, we investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated. Lastly, we discuss the issue of model estimation and highlight a spectrum of methods that stretch from explicit environment-dynamics predictors to more abstract planner-aware models.
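The contrast between the two planning mechanisms can be illustrated with a minimal tabular sketch. This is not the paper's algorithm or experimental setup: the five-state chain, the perfect `forward_model` and `backward_model`, the TD(0)-style update, and the values of `GAMMA` and `ALPHA` are all assumptions chosen to keep the arithmetic easy to follow. The sketch only shows which state's prediction each kind of planning step re-evaluates when both are applied from the same current state.

```python
# Illustrative sketch only: a deterministic chain 0 -> 1 -> 2 -> 3 -> 4,
# where state 4 is terminal and entering it yields reward 1.
GAMMA = 0.9    # discount factor (assumed)
ALPHA = 1.0    # full-step updates keep the numbers simple (assumed)
N_STATES = 5


def forward_model(s):
    """Hypothetical perfect forward model: (reward, next state) after s."""
    s_next = s + 1
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next


def backward_model(s_next):
    """Hypothetical perfect backward model: (previous state, reward) into s_next."""
    s_prev = s_next - 1
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_prev, reward


def forward_planning_update(v, s):
    """Forethought: imagine what follows s, then re-evaluate v[s]."""
    reward, s_next = forward_model(s)
    v[s] += ALPHA * (reward + GAMMA * v[s_next] - v[s])


def backward_planning_update(v, s_next):
    """Hindsight: imagine what led to s_next, then re-evaluate that predecessor."""
    s_prev, reward = backward_model(s_next)
    v[s_prev] += ALPHA * (reward + GAMMA * v[s_next] - v[s_prev])


def demo():
    # Suppose the agent has just learned that state 3 is valuable.
    v_fwd = [0.0, 0.0, 0.0, 1.0, 0.0]
    v_bwd = list(v_fwd)
    forward_planning_update(v_fwd, 3)    # touches only v[3]
    backward_planning_update(v_bwd, 3)   # pushes the new information to v[2]
    return v_fwd, v_bwd


if __name__ == "__main__":
    print(demo())
```

In this toy setting, a forward planning step from the current state re-evaluates the current state's own prediction, while a backward planning step from the same state propagates the new information to its predecessor. Which states to plan from, and how the models are estimated, are the questions the abstract raises and that this sketch leaves open.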
