AI · LG · Apr 3, 2024

Learning Generalized Policies for Fully Observable Non-Deterministic Planning Domains

arXiv:2404.02499v2 · 4 citations · h-index: 51 · IJCAI
AI Analysis

This work develops reactive strategies for large families of fully observable non-deterministic (FOND) planning problems, an incremental extension of methods from classical to non-deterministic domains.

The authors tackle the problem of learning general policies for fully observable non-deterministic (FOND) planning domains by extending existing combinatorial methods from classical planning. The resulting approach searches in an abstract feature space rather than in the given state space. The authors evaluate it experimentally on benchmark FOND domains and prove the correctness of the general policies learned in some of them.

General policies represent reactive strategies for solving large families of planning problems like the infinite collection of solvable instances from a given domain. Methods for learning such policies from a collection of small training instances have been developed successfully for classical domains. In this work, we extend the formulations and the resulting combinatorial methods for learning general policies over fully observable, non-deterministic (FOND) domains. We also evaluate the resulting approach experimentally over a number of benchmark domains in FOND planning, present the general policies that result in some of these domains, and prove their correctness. The method for learning general policies for FOND planning can actually be seen as an alternative FOND planning method that searches for solutions, not in the given state space but in an abstract space defined by features that must be learned as well.
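To make the idea of a general policy over features concrete, here is a minimal Python sketch. It is not the authors' learning method. The toy domain (a stack of n blocks where a `pick` action may remove one block or fail), the single feature n, and the names `successors`, `pick_policy`, `reachable`, and `is_strong_cyclic` are all hypothetical, and the check assumes the standard strong-cyclic notion of a FOND solution, which the abstract does not spell out.

```python
from collections import deque

def successors(state, action):
    """Toy non-deterministic dynamics (assumed): 'pick' removes one block or fails."""
    if action == "pick" and state > 0:
        return {state - 1, state}
    return set()

def is_goal(state):
    """Goal: no blocks left."""
    return state == 0

def pick_policy(state):
    """General policy over the single feature n = state: if n > 0, do 'pick'."""
    return "pick" if state > 0 else None

def reachable(policy, start):
    """All states that can be visited from `start` while following `policy`."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if is_goal(s):
            continue  # the policy stops at goal states
        action = policy(s)
        if action is None:
            continue  # dead end: the policy prescribes nothing here
        for t in successors(s, action):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def is_strong_cyclic(policy, start):
    """True if every policy-reachable state can still reach a goal under the policy."""
    return all(
        any(is_goal(t) for t in reachable(policy, s))
        for s in reachable(policy, start)
    )

if __name__ == "__main__":
    # The same one-rule policy is checked on instances of different sizes.
    for n in range(6):
        print(n, is_strong_cyclic(pick_policy, n))
```

The point of the sketch is that one rule over the feature n covers every instance size, whereas a policy that prescribes no action fails the check on any instance with n > 0.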
