Compositional planning in Markov decision processes: Temporal abstraction meets generalized logic composition
This work addresses the challenge of efficient policy synthesis for complex tasks in AI planning and offers a method that could benefit robotics and autonomous systems, though it appears incremental because it builds on existing hierarchical and compositional techniques.
The paper tackles hierarchical planning in Markov decision processes under temporal logic constraints. It introduces an approach that combines temporal abstraction with generalized logic composition, resulting in a synthesis algorithm that efficiently computes optimal policies by composing sub-policies. The approach is demonstrated on stochastic planning examples, where it improves efficiency.
In hierarchical planning for Markov decision processes (MDPs), temporal abstraction allows planning with macro-actions that take place at different time scales in the form of sequential composition. In this paper, we propose a novel approach to compositional reasoning and hierarchical planning for MDPs under temporal logic constraints. In addition to sequential composition, we introduce a composition of policies based on generalized logic composition: given sub-policies for sub-tasks and a new task expressed as a logic composition of those sub-tasks, a semi-optimal policy, which is optimal when planning with only sub-policies, can be obtained by simply composing the sub-policies. Building on this, we develop a synthesis algorithm that efficiently computes optimal policies by planning with primitive actions, policies for sub-tasks, and the compositions of sub-policies, so as to maximize the probability of satisfying temporal logic specifications. We demonstrate the correctness and efficiency of the proposed method in stochastic planning examples with a single agent and multiple task specifications.
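As a rough illustration of what composing sub-policies can look like, the sketch below handles only the simplest case: a disjunction of two reachability sub-tasks ("reach A or reach B") on a hypothetical five-state corridor. It is not the algorithm from the paper. The corridor, the slip probability, and the names `reach_values` and `compose_or` are all assumptions made for this example. The composition rule used here (run whichever sub-policy has the higher success probability from the current state) relies only on the fact that a policy reaching A with probability p also satisfies "A or B" with probability at least p.

```python
# Hypothetical 1-D corridor: states 0..4, plus an absorbing failure state "X".
STATES = [0, 1, 2, 3, 4, "X"]
ACTIONS = ["left", "right"]
SLIP = 0.2  # assumed probability that any move ends in the failure state


def transitions(s, a):
    """Return {next_state: probability} for taking action a in state s."""
    if s == "X":
        return {"X": 1.0}
    target = max(s - 1, 0) if a == "left" else min(s + 1, 4)
    return {target: 1.0 - SLIP, "X": SLIP}


def reach_values(goal, iters=100):
    """Value iteration for the maximum probability of reaching `goal`,
    planning with primitive actions only."""
    V = {s: 1.0 if s in goal else 0.0 for s in STATES}
    policy = {}
    for _ in range(iters):
        for s in STATES:
            if s in goal or s == "X":
                continue  # goal and failure states keep their fixed values
            q = {a: sum(p * V[t] for t, p in transitions(s, a).items())
                 for a in ACTIONS}
            policy[s] = max(q, key=q.get)
            V[s] = q[policy[s]]
    return V, policy


# Sub-policies for the two sub-tasks, each solved once.
V_A, pi_A = reach_values({0})   # sub-task A: reach state 0
V_B, pi_B = reach_values({4})   # sub-task B: reach state 4


def compose_or(s):
    """Composed policy for 'A or B': pick the sub-policy that is more
    likely to succeed from s. Its value is a lower bound on the optimum."""
    return ("pi_A", V_A[s]) if V_A[s] >= V_B[s] else ("pi_B", V_B[s])


# Reference: solve the composite task directly with primitive actions.
V_opt, _ = reach_values({0, 4})

for s in range(5):
    name, value = compose_or(s)
    print(s, name, round(value, 4), round(V_opt[s], 4))
```

On this toy corridor the composed value happens to coincide with the directly computed optimum. In general, choosing among sub-policies gives only a lower bound on the optimal satisfaction probability, which is consistent with the abstract's distinction between a semi-optimal policy (planning with sub-policies only) and an optimal one (planning with primitive actions as well).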