AI LG · Dec 10, 2025

An End-to-end Planning Framework with Agentic LLMs and PDDL

Emanuele La Malfa, Ping Zhu, Samuele Marro, Sara Bernardini, Michael Wooldridge

arXiv:2512.09629v1 · 7.82 citations · h-index: 11

Originality Highly original

AI Analysis

This work addresses the challenge of automating planning from human specifications, a problem relevant to researchers and practitioners in AI planning. The contribution is incremental in that it builds on existing PDDL and LLM methods.

The authors tackled the problem of converting natural language specifications into executable plans by developing an end-to-end framework that uses LLMs to generate and refine PDDL models, then passes them to external planning engines. They demonstrated effectiveness on benchmarks like Google NaturalPlan and PlanBench, handling tasks where LLMs typically struggle, such as Blocksworld and Tower of Hanoi.

We present an end-to-end framework for planning supported by verifiers. An orchestrator receives a human specification written in natural language and converts it into a PDDL (Planning Domain Definition Language) model, where the domain and problem are iteratively refined by sub-modules (agents) to address common planning requirements, such as time constraints and optimality, as well as ambiguities and contradictions that may exist in the human specification. The validated domain and problem are then passed to an external planning engine to generate a plan. The orchestrator and agents are powered by Large Language Models (LLMs) and require no human intervention at any stage of the process. Finally, a module translates the final plan back into natural language to improve human readability while maintaining the correctness of each step. We demonstrate the flexibility and effectiveness of our framework across various domains and tasks, including the Google NaturalPlan benchmark and PlanBench, as well as planning problems like Blocksworld and the Tower of Hanoi (where LLMs are known to struggle even with small instances). Our framework can be integrated with any PDDL planning engine and validator (such as Fast Downward, LPG, POPF, VAL, and uVAL, which we have tested) and represents a significant step toward end-to-end planning aided by LLMs.
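The pipeline described in the abstract (generate a PDDL model, validate it, refine it on failure, solve, then translate the plan back) can be sketched as a simple control loop. This is only an illustrative sketch, not the authors' implementation: `PddlModel`, `plan_from_spec`, and the five callables are hypothetical names, and the LLM agents, validator, and planner are passed in as plain functions rather than tied to any real tool's interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PddlModel:
    """Hypothetical container for the two PDDL texts the framework refines."""
    domain: str
    problem: str


def plan_from_spec(
    spec: str,
    generate: Callable[[str], PddlModel],                  # assumed: LLM turns text into PDDL
    validate: Callable[[PddlModel], List[str]],            # assumed: returns error messages
    refine: Callable[[PddlModel, List[str]], PddlModel],   # assumed: LLM agent repairs the model
    solve: Callable[[PddlModel], Optional[List[str]]],     # assumed: wraps an external planner
    translate: Callable[[List[str]], str],                 # assumed: plan back to natural language
    max_rounds: int = 3,
) -> str:
    """Generate, validate, refine, solve, translate. No human in the loop."""
    model = generate(spec)
    for round_idx in range(max_rounds + 1):
        errors = validate(model)
        if not errors:
            plan = solve(model)
            if plan is None:
                raise RuntimeError("planner returned no plan for a valid model")
            return translate(plan)
        if round_idx < max_rounds:
            # Feed the validator's messages back to the refining agent.
            model = refine(model, errors)
    raise RuntimeError(f"model still invalid after {max_rounds} refinement rounds")
```

In a real deployment, `validate` and `solve` would presumably wrap a validator and planning engine such as those the abstract names; the sketch deliberately leaves those calls abstract. Bounding the loop with `max_rounds` is a design assumption made here so that a specification the agents cannot repair fails loudly instead of looping forever.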
