LG · AI · ML · Sep 26, 2020

Graph neural induction of value iteration

arXiv:2009.12604v1 · 12 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more flexible, directly supervised planning components in deep reinforcement learning systems. It appears incremental, extending existing neural planning methods to broader environments.

The paper tackles the problem of incorporating explicit planning into reinforcement learning by proposing a graph neural network that directly executes the value iteration algorithm across arbitrary environment models. The network models value iteration accurately and recovers favourable metrics and policies in out-of-distribution tests.

Many reinforcement learning tasks can benefit from explicit planning based on an internal model of the environment. Previously, such planning components have been incorporated through a neural network that partially aligns with the computational graph of value iteration. Such networks have so far been focused on restrictive environments (e.g., grid-worlds), and modelled the planning procedure only indirectly. We relax these constraints, proposing a graph neural network (GNN) that executes the value iteration (VI) algorithm, across arbitrary environment models, with direct supervision on the intermediate steps of VI. The results indicate that GNNs are able to model value iteration accurately, recovering favourable metrics and policies across a variety of out-of-distribution tests. This suggests that GNN executors with strong supervision are a viable component within deep reinforcement learning systems.
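
For readers unfamiliar with the algorithm the abstract refers to, here is a minimal sketch of classic tabular value iteration in NumPy. It is not the paper's GNN. It only shows the Bellman backup that such a network would be trained to imitate, and it records every intermediate value vector, which is the kind of per-step signal that "direct supervision on the intermediate steps of VI" could use. The function names, array layout, and the default `gamma` and `num_steps` are assumptions made for illustration.

```python
import numpy as np


def value_iteration_trace(P, R, gamma=0.9, num_steps=50):
    """Run tabular value iteration, keeping every intermediate estimate.

    Assumed layout (illustrative, not taken from the paper):
      P: transition probabilities, shape (A, S, S), P[a, s, t] = p(t | s, a)
      R: expected rewards, shape (S, A)
    Returns an array of shape (num_steps + 1, S) with rows V_0, V_1, ...
    """
    num_states = P.shape[1]
    V = np.zeros(num_states)
    trace = [V.copy()]
    for _ in range(num_steps):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
        trace.append(V.copy())
    return np.stack(trace)


def greedy_policy(P, R, V, gamma=0.9):
    """Return the action that maximises the one-step lookahead in each state."""
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return Q.argmax(axis=1)


if __name__ == "__main__":
    # A hypothetical 2-state, 2-action MDP: action 0 stays, action 1 switches.
    P = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[0.0, 1.0], [1.0, 0.0]]])
    # Reward 1 only for staying in state 1.
    R = np.array([[0.0, 0.0],
                  [1.0, 0.0]])
    trace = value_iteration_trace(P, R, gamma=0.5)
    print(trace[-1], greedy_policy(P, R, trace[-1], gamma=0.5))
```

Because `P` is an arbitrary transition tensor rather than a grid, the same backup applies to any finite environment model, which is the generality the abstract contrasts with grid-world-specific planners.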
