CL · Feb 29, 2024

PROC2PDDL: Open-Domain Planning Representations from Texts

arXiv:2403.00092v2 · 30 citations · h-index: 67 · NLRSE
Originality Synthesis-oriented
AI Analysis

This paper addresses AI planning in text-based environments by giving researchers a challenging benchmark, though the contribution is incremental: it applies existing methods to new data.

The authors tackled the challenge of generating planning domain definitions from open-domain procedural texts by introducing Proc2PDDL, a dataset with expert-annotated PDDL representations, and found that state-of-the-art models like GPT-3.5 and GPT-4 achieved success rates close to 0% and around 35%, respectively.

Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. Using this dataset, we evaluate state-of-the-art models on defining the preconditions and effects of actions. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%. Our analysis shows both syntactic and semantic errors, indicating LMs' deficiency in both generating domain-specific programs and reasoning about events. We hope this analysis and dataset help future progress towards integrating the best of LMs and formal planning.
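
To make "preconditions and effects of actions" concrete, here is a minimal Python sketch of a STRIPS-style action, the kind of structure a PDDL action definition encodes. It is an illustration only: the `Action` class, the `boil_water` action, and the fact strings are hypothetical and are not taken from the Proc2PDDL dataset or the paper's evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A simplified STRIPS-style action (hypothetical, for illustration)."""
    name: str
    preconditions: frozenset  # facts that must hold before the action runs
    add_effects: frozenset    # facts the action makes true
    del_effects: frozenset    # facts the action makes false


def applicable(state: frozenset, action: Action) -> bool:
    # An action is applicable when every precondition is in the current state.
    return action.preconditions <= state


def apply(state: frozenset, action: Action) -> frozenset:
    # Remove the deleted facts, then add the new ones.
    if not applicable(state, action):
        raise ValueError(f"preconditions of {action.name} are not satisfied")
    return (state - action.del_effects) | action.add_effects


# Hypothetical action that a procedural text such as "boil the water" might imply.
boil_water = Action(
    name="boil_water",
    preconditions=frozenset({"has(pot)", "has(water)", "has(fire)"}),
    add_effects=frozenset({"boiled(water)"}),
    del_effects=frozenset(),
)

state = frozenset({"has(pot)", "has(water)", "has(fire)"})
new_state = apply(state, boil_water)
```

In this framing, the task evaluated in the abstract amounts to filling in the `preconditions` and effect sets for each action from the text. A syntactic error would correspond to an ill-formed definition, and a semantic error to a well-formed definition with the wrong facts.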

