XML Prompting as Grammar-Constrained Interaction: Fixed-Point Semantics, Convergence Guarantees, and Human-AI Protocols
This work addresses the challenge of ensuring reliable and structured outputs from LLMs in real-world systems, offering a formal framework for human-AI interaction protocols, but it is incremental as it builds on existing methods like grammar-aligned decoding.
The paper tackled the problem of steering large language models to produce parseable, schema-adherent outputs using XML prompting, by developing a logic-first treatment that unifies grammar-constrained decoding, fixed-point semantics, and convergent human-AI interaction loops, with results including proofs of least fixed points and convergence guarantees for iterative guidance.
Structured prompting with XML tags has emerged as an effective way to steer large language models (LLMs) toward parseable, schema-adherent outputs in real-world systems. We develop a logic-first treatment of XML prompting that unifies (i) grammar-constrained decoding, (ii) fixed-point semantics over lattices of hierarchical prompts, and (iii) convergent human-AI interaction loops. We formalize a complete lattice of XML trees under a refinement order and prove that monotone prompt-to-prompt operators admit least fixed points (Knaster-Tarski) that characterize steady-state protocols; under a task-aware contraction metric on trees, we further prove Banach-style convergence of iterative guidance. We instantiate these results with context-free grammars (CFGs) for XML schemas and show how constrained decoding guarantees well-formedness while preserving task performance. A set of multi-layer human-AI interaction recipes demonstrates practical deployment patterns, including multi-pass "plan $\to$ verify $\to$ revise" routines and agentic tool use. We provide mathematically complete proofs and tie our framework to recent advances in grammar-aligned decoding, chain-of-verification, and programmatic prompting.