Generating consistent PDDL domains with Large Language Models
This work addresses the challenge of reliable automated planning for AI researchers and practitioners, though it is incremental in that it builds on existing LLM-based generation methods.
The paper addresses action consistency in PDDL domains generated by Large Language Models by introducing automated consistency checking during generation, reducing human correction effort by up to 40% in tested domains such as logistics and household.
Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausible-looking PDDL markup. However, ensuring that actions are consistent within a domain remains a challenging task. In this paper we present a novel concept to significantly improve the quality of LLM-generated PDDL models by performing automated consistency checking during the generation process. Although the proposed consistency checking strategies cannot guarantee absolute correctness of the generated models, they can serve as a valuable source of feedback, reducing the amount of correction effort expected from a human in the loop. We demonstrate the capabilities of our error detection approach on a number of classical and custom planning domains (logistics, gripper, tyreworld, household, pizza).
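The abstract does not state which consistency checks are performed, so the following is only a minimal illustrative sketch of what one such check might look like. It assumes a heavily simplified domain representation in which each action lists predicate names only (no parameters or types). The `Action` class, the `check_consistency` function, and both checks (undeclared predicates, and predicates both added and deleted by the same action) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical, simplified action: predicate names only, no parameters."""
    name: str
    preconditions: set = field(default_factory=set)
    add_effects: set = field(default_factory=set)
    del_effects: set = field(default_factory=set)


def check_consistency(declared_predicates, actions):
    """Return human-readable issues that could be fed back to the LLM or a human."""
    issues = []
    for action in actions:
        used = action.preconditions | action.add_effects | action.del_effects
        # Assumed check 1: every predicate an action uses must be declared.
        for pred in sorted(used - declared_predicates):
            issues.append(f"{action.name}: predicate '{pred}' is not declared")
        # Assumed check 2: an action should not add and delete the same predicate.
        for pred in sorted(action.add_effects & action.del_effects):
            issues.append(f"{action.name}: predicate '{pred}' is both added and deleted")
    return issues


# Toy gripper-style domain; 'holding' is a deliberate error in the 'drop' action.
declared = {"at-robby", "at", "free", "carry"}
pick = Action("pick", {"at", "at-robby", "free"}, {"carry"}, {"at", "free"})
drop = Action("drop", {"carry", "at-robby"}, {"at", "free", "holding"}, {"carry"})

issues = check_consistency(declared, [pick, drop])
for issue in issues:
    print(issue)
```

In a generate-and-check loop, messages of this kind could be returned to the model as feedback before a human reviews the domain. A real checker would need to parse PDDL and handle parameters and types.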