CL | Dec 13, 2024

On the Limit of Language Models as Planning Formalizers

arXiv:2412.09879v4 | 16 citations | h-index: 2 | ACL
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making LLM-generated plans executable and verifiable in grounded environments. It is an incremental advance that extends prior methods to more realistic scenarios.

The paper studies Large Language Models (LLMs) as formalizers that generate complete Planning Domain Definition Language (PDDL) representations from natural-language environment descriptions. It finds that most sufficiently large models formalize descriptions effectively, outperform direct plan generation, and are robust to lexical perturbation, but that performance drops as the descriptions become more natural-sounding.

Large Language Models have been found to create plans that are neither executable nor verifiable in grounded environments. An emerging line of work demonstrates success in using the LLM as a formalizer to generate a formal representation of the planning domain in some language, such as Planning Domain Definition Language (PDDL). This formal representation can be deterministically solved to find a plan. We systematically evaluate this methodology while bridging some major gaps. While previous work only generates a partial PDDL representation, given templated, and therefore unrealistic environment descriptions, we generate the complete representation given descriptions of various naturalness levels. Among an array of observations critical to improve LLMs' formal planning abilities, we note that most large enough models can effectively formalize descriptions as PDDL, outperforming those directly generating plans, while being robust to lexical perturbation. As the descriptions become more natural-sounding, we observe a decrease in performance and provide detailed error analysis.
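The claim that a formal representation "can be deterministically solved to find a plan" can be illustrated with a minimal sketch. This is not the paper's pipeline: the domain below is a hypothetical, hand-written stand-in for what a formalizer might emit, encoded as Python data rather than PDDL text, and the breadth-first `solve` function stands in for an off-the-shelf PDDL planner. All names (`Action`, `solve`, the fact strings) are assumptions made for illustration.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """A ground STRIPS-style action (hypothetical stand-in for a PDDL action)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def solve(initial, goal, actions):
    """Breadth-first search over states; returns a shortest plan or None."""
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan
        for action in actions:
            if action.preconditions <= state:  # action is applicable
                nxt = (state - action.del_effects) | action.add_effects
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [action.name]))
    return None  # state space exhausted: no plan exists


# Assumed formalization of: "a robot in room A must carry a box to room B".
actions = [
    Action("pick-up",
           frozenset({"robot-in-a", "box-in-a", "hand-empty"}),
           frozenset({"holding-box"}),
           frozenset({"box-in-a", "hand-empty"})),
    Action("move-a-b",
           frozenset({"robot-in-a"}),
           frozenset({"robot-in-b"}),
           frozenset({"robot-in-a"})),
    Action("move-b-a",
           frozenset({"robot-in-b"}),
           frozenset({"robot-in-a"}),
           frozenset({"robot-in-b"})),
    Action("put-down-b",
           frozenset({"robot-in-b", "holding-box"}),
           frozenset({"box-in-b", "hand-empty"}),
           frozenset({"holding-box"})),
]
initial = {"robot-in-a", "box-in-a", "hand-empty"}
goal = {"box-in-b"}

plan = solve(initial, goal, actions)
print(plan)  # ['pick-up', 'move-a-b', 'put-down-b']
```

The point of the design is the division of labor: the language model only has to produce the domain and problem description, while the search itself is deterministic, so any returned plan is executable by construction and a missing plan signals an error in the formalization rather than in the planning.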

Code Implementations: 1 repo