AILGNov 23, 2022

Understanding Sample Generation Strategies for Learning Heuristic Functions in Classical Planning

arXiv:2211.13316v44 citationsh-index: 18
Originality Incremental advance
AI Analysis

This work addresses the problem of improving heuristic search efficiency in classical planning for AI researchers, though it is incremental as it builds on existing learning methods.

The paper investigates how sample generation strategies affect the performance of learned heuristic functions in classical planning, finding that both the algorithm used to generate samples and the accuracy of cost-to-goal estimates are key factors, and proposes practical strategies that increase mean coverage by over 30% compared to a baseline.

We study the problem of learning good heuristic functions for classical planning tasks with neural networks based on samples represented by states with their cost-to-goal estimates. The heuristic function is learned for a state space and goal condition with the number of samples limited to a fraction of the size of the state space, and must generalize well for all states of the state space with the same goal condition. Our main goal is to better understand the influence of sample generation strategies on the performance of a greedy best-first heuristic search (GBFS) guided by a learned heuristic function. In a set of controlled experiments, we find that two main factors determine the quality of the learned heuristic: the algorithm used to generate the sample set and how close the sample estimates to the perfect cost-to-goal are. These two factors are dependent: having perfect cost-to-goal estimates is insufficient if the samples are not well distributed across the state space. We also study other effects, such as adding samples with high-value estimates. Based on our findings, we propose practical strategies to improve the quality of learned heuristics: three strategies that aim to generate more representative states and two strategies that improve the cost-to-goal estimates. Our practical strategies result in a learned heuristic that, when guiding a GBFS algorithm, increases by more than 30% the mean coverage compared to a baseline learned heuristic.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes