LG | AI | LO | Jun 20, 2022

Policy Optimization with Linear Temporal Logic Constraints

arXiv:2206.09546v2 | 25 citations | h-index: 37
Originality: Incremental advance
AI Analysis

This paper addresses flexible task specification in reinforcement learning, offering researchers and practitioners an incremental improvement over cost shaping methods.

The paper frames policy optimization with linear temporal logic constraints as a way to decouple task specification from policy selection. Its model-based approach comes with a sample complexity guarantee for both task satisfaction and cost optimality, and it performs strongly in low-sample regimes in the experiments.

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework, decoupling task specification from policy selection, and as an alternative to the standard of cost shaping. With access to a generative model, we develop a model-based approach that enjoys a sample complexity analysis for guaranteeing both task satisfaction and cost optimality (through a reduction to a reachability problem). Empirically, our algorithm can achieve strong performance even in low-sample regimes.
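
To make the "reduction to a reachability problem" concrete, here is a minimal sketch and not the paper's algorithm. It assumes a hypothetical four-state MDP with known transition probabilities and a simple task, "eventually reach GOAL and never enter HAZARD". Once GOAL and HAZARD are treated as absorbing, the probability of satisfying the task equals the probability of reaching GOAL, which plain value iteration can maximize. All state names, actions, and probabilities below are invented for illustration.

```python
# Illustrative sketch only: a hypothetical 4-state MDP, not taken from the paper.
# Task (informally): "eventually reach GOAL and never enter HAZARD".
# GOAL and HAZARD are absorbing, so task satisfaction reduces to reachability.

START, MID, GOAL, HAZARD = 0, 1, 2, 3

# TRANSITIONS[state][action] = list of (probability, next_state).
# Absorbing states (GOAL, HAZARD) have no entry because no decision is made there.
TRANSITIONS = {
    START: {
        "safe": [(1.0, MID)],
        "risky": [(0.5, GOAL), (0.5, HAZARD)],
    },
    MID: {
        "go": [(0.9, GOAL), (0.1, HAZARD)],
    },
}


def max_reach_probability(transitions, goal, num_states, iterations=100):
    """Value iteration for the maximum probability of eventually reaching `goal`."""
    values = [0.0] * num_states
    values[goal] = 1.0  # the goal is reached with certainty from itself
    for _ in range(iterations):
        new_values = list(values)
        for state, actions in transitions.items():
            # Best action = highest expected reach probability of its successors.
            new_values[state] = max(
                sum(p * values[nxt] for p, nxt in outcomes)
                for outcomes in actions.values()
            )
        values = new_values
    return values


def greedy_policy(transitions, values):
    """Pick, in every decision state, the action maximizing reach probability."""
    policy = {}
    for state, actions in transitions.items():
        policy[state] = max(
            actions,
            key=lambda a: sum(p * values[nxt] for p, nxt in actions[a]),
        )
    return policy


values = max_reach_probability(TRANSITIONS, GOAL, num_states=4)
policy = greedy_policy(TRANSITIONS, values)
print(values[START], policy[START])
```

In this toy model the "safe" action at START is preferred because routing through MID reaches GOAL with higher probability than the "risky" shortcut. General LTL formulas need more machinery than this (typically an automaton tracking progress through the formula), and the paper additionally optimizes cost among task-satisfying policies, which the sketch omits.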

