LG · AI · Jun 25, 2021

Compositional Reinforcement Learning from Logical Specifications

arXiv:2106.13906v3 · 105 citations
Originality: Incremental advance
AI Analysis

This paper addresses the poor scalability of reinforcement learning on complex logical tasks and offers a solution for domains that require high-level planning. It appears incremental, however, because it builds on existing reward-shaping and planning methods.

The paper tackles the problem of learning control policies for complex tasks specified by logical formulas, a setting where previous methods scale poorly because the tasks require high-level planning. The proposed compositional approach, DiRL, interleaves planning and reinforcement learning and outperforms state-of-the-art baselines on challenging continuous control benchmarks.

We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning. First, DiRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.
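The interleaving described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each edge's cost is the negative log of an estimated success probability for that edge's policy, so the shortest path is the plan most likely to succeed. The function `learn_edge`, the vertex names, and the probabilities are all hypothetical: `learn_edge` stands in for training a neural network policy on a sub-task and estimating how often it reaches the target region, and here it simply looks the value up in a table.

```python
import heapq
import math


def plan(graph, source, targets, learn_edge):
    """Dijkstra-style search over an abstract graph.

    Edge costs are not known up front: learn_edge(u, v) is called only when
    vertex u is expanded, mimicking "train a policy for this sub-task now".
    Assumed cost of an edge: -log(estimated success probability).
    Returns (path, estimated success probability of the whole path).
    """
    dist = {source: 0.0}
    parent = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u in targets:
            # Reconstruct the high-level plan by walking parents back.
            path = [u]
            while path[-1] != source:
                path.append(parent[path[-1]])
            return path[::-1], math.exp(-d)
        for v in graph.get(u, []):
            if v in done:
                continue
            p = learn_edge(u, v)  # hypothetical stand-in for RL training
            if p <= 0.0:
                continue  # sub-task never succeeded; treat edge as unusable
            nd = d - math.log(p)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return None, 0.0


# Hypothetical abstract graph: vertices are regions, edges are sub-tasks.
graph = {"start": ["a", "b"], "a": ["goal"], "b": ["goal"], "goal": []}

# Hypothetical success estimates that a trained edge policy might achieve.
success = {
    ("start", "a"): 0.9,
    ("a", "goal"): 0.5,
    ("start", "b"): 0.6,
    ("b", "goal"): 0.9,
}

trained = []  # records the order in which edge policies were "trained"


def learn_edge(u, v):
    trained.append((u, v))
    return success[(u, v)]


path, prob = plan(graph, "start", {"goal"}, learn_edge)
print(path, round(prob, 2))
```

In this toy instance the route through `b` is chosen because 0.6 × 0.9 exceeds 0.9 × 0.5. One simplification to note: the stub returns a fixed number per edge, whereas a real edge policy would be trained from the states actually reached by the plan leading up to that edge, so its success estimate depends on the path taken so far.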

Code Implementations: 1 repo