Graph Convolutional Memory using Topological Priors
This work addresses the challenge of incomplete world views in real-world reinforcement learning applications. It offers a novel memory model that improves performance when given human priors, though its gains are incremental when priors are not used.
The paper tackles the problem of solving partially-observable Markov decision processes (POMDPs) in reinforcement learning by introducing graph convolutional memory (GCM), a hybrid memory model that uses topological priors to form graph neighborhoods and queries memories via graph convolution. The results show that GCM with human priors outperforms state-of-the-art methods on control, memorization, and navigation tasks while using significantly fewer parameters.
Solving partially-observable Markov decision processes (POMDPs) is critical when applying reinforcement learning to real-world problems, where agents have an incomplete view of the world. We present graph convolutional memory (GCM), the first hybrid memory model for solving POMDPs using reinforcement learning. GCM uses either human-defined or data-driven topological priors to form graph neighborhoods, combining them into a larger network topology using dynamic programming. We query the graph using graph convolution, coalescing relevant memories into a context-dependent belief. When used without human priors, GCM performs similarly to state-of-the-art methods. When used with human priors, GCM outperforms these methods on control, memorization, and navigation tasks while using significantly fewer parameters.
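The mechanism described above (a prior defines each observation's neighborhood, and a graph convolution merges neighboring memories into a belief) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The names `temporal_prior`, `build_adjacency`, and `graph_conv`, the window size, the mean-aggregation layer, and the random weights are all assumptions chosen for illustration, and a plain loop stands in for the dynamic-programming step.

```python
import numpy as np

def temporal_prior(t, window=2):
    """Hypothetical human-defined prior: link node t to up to `window` preceding nodes."""
    return list(range(max(0, t - window), t))

def build_adjacency(num_nodes, prior):
    """Merge per-node neighborhoods into one undirected graph.

    A simple loop is used here; the paper's dynamic-programming
    construction is not reproduced.
    """
    adj = np.zeros((num_nodes, num_nodes))
    for t in range(num_nodes):
        for j in prior(t):
            adj[t, j] = 1.0
            adj[j, t] = 1.0
    return adj

def graph_conv(obs, adj, w_self, w_neigh):
    """One mean-aggregation graph convolution layer (an assumed, generic form)."""
    deg = adj.sum(axis=1, keepdims=True)
    # Average neighbor features; guard against isolated nodes with degree 0.
    neigh_mean = (adj @ obs) / np.maximum(deg, 1.0)
    return np.tanh(obs @ w_self + neigh_mean @ w_neigh)

rng = np.random.default_rng(0)
obs = rng.normal(size=(5, 4))        # 5 stored observations, 4 features each
adj = build_adjacency(5, temporal_prior)
w_self = rng.normal(size=(4, 3))     # untrained, illustrative weights
w_neigh = rng.normal(size=(4, 3))

hidden = graph_conv(obs, adj, w_self, w_neigh)
belief = hidden[-1]                  # belief read out at the most recent node
```

In this sketch the belief for the current step depends only on the latest observation and the memories its prior links to, which is the sense in which the prior decides which memories are relevant. Swapping `temporal_prior` for a different function, for example one based on spatial distance, changes the graph without changing the convolution.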