CL · Feb 29, 2024

PROC2PDDL: Open-Domain Planning Representations from Texts

Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon

arXiv:2403.00092v2 · 15.930 citations · h-index: 67 · NLRSE

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of AI planning in text-based environments by giving researchers a challenging benchmark, though the contribution is incremental: it applies existing methods to new data.

The authors tackled the challenge of generating planning domain definitions from open-domain procedural texts by introducing Proc2PDDL, a dataset with expert-annotated PDDL representations, and found that state-of-the-art models like GPT-3.5 and GPT-4 achieved success rates close to 0% and around 35%, respectively.

Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. Using this dataset, we evaluate state-of-the-art models on defining the preconditions and effects of actions. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%. Our analysis shows both syntactic and semantic errors, indicating LMs' deficiency in both generating domain-specific programs and reasoning about events. We hope this analysis and dataset help future progress towards integrating the best of LMs and formal planning.
