TruncProof: A Guardrail for LLM-based JSON Generation under Token-Length Constraints
This work addresses the practical problem of enforcing a token-length limit on LLM-generated JSON, which is critical for integration with external systems. The solution is incremental, however, as it builds on existing grammar-constrained methods.
TruncProof is a grammar-constrained generation method that ensures LLMs produce valid JSON within a predefined token limit, preventing infinite generation and the system malfunctions caused by truncated output. Experiments show that it generates syntactically correct outputs under strict token constraints and maintains semantic accuracy when combined with advanced decoding strategies.
LLM-based generation of machine-readable outputs such as JSON has attracted significant attention for integration with external systems. However, existing approaches cannot strictly enforce a maximum number of generated tokens, which leads to infinite generation or to truncated outputs that cause downstream systems to malfunction. To address this limitation, we propose TruncProof, a novel grammar-constrained generation method that enables LLMs to produce grammatically valid JSON while adhering to a predefined token limit. By leveraging the properties of LL(1) parsers, TruncProof efficiently approximates the minimum number of tokens required to complete a grammatically valid output at each decoding step. Experiments on Text-to-JSON instruction tasks demonstrate that TruncProof successfully generates syntactically correct outputs even under strict token constraints. Furthermore, we show that TruncProof can be effectively combined with advanced decoding strategies, resulting in outputs that are not only grammatically valid but also semantically accurate.
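The budget-aware masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy grammar (nested arrays of the scalar `0`), a vocabulary in which every symbol is exactly one token, and a fixed preference order standing in for the LLM's ranking. All names (`State`, `step`, `min_remaining`, `generate`) are hypothetical. In this toy setting the minimum number of tokens needed to finish can be computed exactly from the bracket depth, whereas the paper describes approximating it from LL(1) parser properties.

```python
import json
from dataclasses import dataclass

VOCAB = ["[", "]", "0", ","]  # assumption: each grammar symbol is one token


@dataclass(frozen=True)
class State:
    depth: int  # number of currently open arrays
    mode: str   # "value", "value_or_close", "after_value", or "done"


def step(state, tok):
    """Return the next parser state, or None if tok is ungrammatical here."""
    d, m = state.depth, state.mode
    if m in ("value", "value_or_close"):
        if tok == "0":
            return State(d, "after_value" if d > 0 else "done")
        if tok == "[":
            return State(d + 1, "value_or_close")
    if m in ("value_or_close", "after_value") and tok == "]":
        return State(d - 1, "after_value" if d - 1 > 0 else "done")
    if m == "after_value" and tok == ",":
        return State(d, "value")
    return None


def min_remaining(state):
    """Fewest tokens needed to reach a complete, valid output from state."""
    if state.mode == "done":
        return 0
    if state.mode == "value":
        return 1 + state.depth  # one scalar, then close every open array
    return state.depth          # just close every open array


def allowed_tokens(state, budget):
    """Tokens that are grammatical AND still leave room to finish."""
    result = []
    for tok in VOCAB:
        nxt = step(state, tok)
        if nxt is not None and min_remaining(nxt) <= budget - 1:
            result.append(tok)
    return result


def generate(preference, max_tokens):
    """Greedy decoding; `preference` is a stand-in for the model's ranking."""
    state, budget, out = State(0, "value"), max_tokens, []
    while state.mode != "done":
        allowed = allowed_tokens(state, budget)
        if not allowed:
            raise ValueError("token budget too small for any valid output")
        tok = next(t for t in preference if t in allowed)
        out.append(tok)
        state, budget = step(state, tok), budget - 1
    return "".join(out)


# A "model" that always prefers to open another array would never stop
# without the budget check; with it, the output closes in time.
text = generate(["[", "0", ",", "]"], max_tokens=6)
json.loads(text)  # parses without error
print(text)       # [[[]]]
```

The design choice worth noting is that the check looks one step ahead: a token is admitted only if the state it leads to can still be completed within the remaining budget, so the decoder never enters a state from which truncation is unavoidable.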