LG · ML · Jun 30, 2020

MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

arXiv:2006.16908v2 · 192 citations
Originality: Incremental advance
AI Analysis

This work addresses slow convergence in deep reinforcement learning by exploiting symmetries of the underlying MDP. The advance is incremental, since it builds on existing equivariance concepts.

The paper tackles the problem of inefficient deep reinforcement learning by introducing MDP homomorphic networks that incorporate group symmetries as prior knowledge, resulting in faster convergence on tasks like CartPole, a grid world, and Pong compared to unstructured baselines.

This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.
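The numerical construction of equivariant layers mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' code. It assumes one common recipe: draw random weight matrices, average each over the group so it satisfies the equivariance constraint K_g W = W L_g, then use an SVD to extract a basis of the equivariant subspace. The function names (`symmetrize`, `equivariant_basis`) are hypothetical. The CartPole-style representations are also assumptions for illustration: a reflection that negates the 4-dimensional state and swaps the two actions. Bias terms are left out.

```python
import numpy as np

def symmetrize(W, in_reps, out_reps):
    """Average K_g^{-1} W L_g over the group.

    The result satisfies K_g W = W L_g for every group element g.
    """
    terms = [np.linalg.inv(K) @ W @ L for L, K in zip(in_reps, out_reps)]
    return sum(terms) / len(terms)

def equivariant_basis(in_reps, out_reps, n_samples=100, tol=1e-6, seed=0):
    """Numerically find a basis for the space of equivariant weight matrices."""
    rng = np.random.default_rng(seed)
    d_out, d_in = out_reps[0].shape[0], in_reps[0].shape[0]
    # Each row is one symmetrized random matrix, flattened to a vector.
    samples = np.stack([
        symmetrize(rng.standard_normal((d_out, d_in)), in_reps, out_reps).ravel()
        for _ in range(n_samples)
    ])
    # Right singular vectors with non-negligible singular values span the subspace.
    _, s, vt = np.linalg.svd(samples, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))
    return vt[:rank].reshape(rank, d_out, d_in)

# Assumed two-element reflection group {identity, reflection}:
# the reflection negates the state (L_g = -I) and swaps the two actions (K_g).
in_reps = [np.eye(4), -np.eye(4)]
out_reps = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]

basis = equivariant_basis(in_reps, out_reps)

# An equivariant layer is a linear combination of the basis matrices;
# in training, the coefficients would be the learnable parameters.
coeffs = np.random.default_rng(1).standard_normal(len(basis))
W = np.tensordot(coeffs, basis, axes=1)

# Check: reflecting the state swaps the two outputs.
state = np.array([0.1, -0.2, 0.05, 0.3])
out_original = W @ state
out_reflected = W @ (in_reps[1] @ state)
```

Under these assumptions the constraint forces the second row of `W` to be the negative of the first, so the 8 free weights of an unconstrained 2x4 layer shrink to 4 learnable coefficients. That is the sense in which the equivariance constraint reduces the size of the solution space.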

Code Implementations: 2 repos
