CL · Jun 19, 2024

Adaptable Logical Control for Large Language Models

Honghua Zhang, Po-Nien Kung, Masahiro Yoshida, Guy Van den Broeck, Nanyun Peng

arXiv:2406.13892v2 · 13.233 citations · Has Code

Originality: Highly original

AI Analysis

This paper addresses the problem of reliable logical control in LLM generation for users who need precise outputs. It represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles the challenge of making large language model generation follow logical constraints. It introduces Ctrl-G, a framework that combines an LLM with a Hidden Markov Model. Applied to a TULU2-7B model, Ctrl-G achieves an over 30% higher satisfaction rate than GPT4 in human evaluation on interactive text editing.

Despite the success of Large Language Models (LLMs) on various tasks following human instructions, controlling model generation at inference time remains a persistent challenge. In this paper, we introduce Ctrl-G, an adaptable framework that facilitates tractable and flexible control of LLM generation to reliably follow logical constraints. Ctrl-G combines any production-ready LLM with a Hidden Markov Model, enabling LLM outputs to adhere to logical constraints represented as deterministic finite automata. We show that Ctrl-G, when applied to a TULU2-7B model, outperforms GPT3.5 and GPT4 on the task of interactive text editing: specifically, for the task of generating text insertions/continuations following logical constraints, Ctrl-G achieves an over 30% higher satisfaction rate in human evaluation compared to GPT4. When applied to medium-size language models (e.g., GPT2-large), Ctrl-G also beats its counterparts for constrained generation by large margins on standard benchmarks. Additionally, as a proof-of-concept study, we apply Ctrl-G to the Grade School Math benchmark to assist LLM reasoning, foreshadowing the application of Ctrl-G, as well as other constrained generation approaches, beyond traditional language generation tasks.
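
To make "logical constraints represented as deterministic finite automata" concrete, here is a minimal Python sketch of a keyword constraint written as a two-state DFA over word tokens, with a check for which next tokens still leave an accepting state reachable within a length budget. This is not the Ctrl-G implementation: it includes neither the LLM nor the Hidden Markov Model, and the names `DFA`, `must_contain`, `can_accept`, and `allowed_tokens` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    """Toy DFA over word tokens (illustrative only, not the paper's code)."""
    transitions: dict   # (state, token) -> next state, for explicitly listed tokens
    default: dict       # state -> next state for every other token
    start: int
    accepting: frozenset

    def step(self, state, token):
        return self.transitions.get((state, token), self.default[state])

    def accepts(self, tokens):
        state = self.start
        for tok in tokens:
            state = self.step(state, tok)
        return state in self.accepting


def must_contain(keyword):
    """State 0: keyword not seen yet. State 1: keyword seen (accepting)."""
    return DFA(
        transitions={(0, keyword): 1},
        default={0: 0, 1: 1},
        start=0,
        accepting=frozenset({1}),
    )


def can_accept(dfa, state, vocab, steps):
    """True if an accepting state is reachable from `state` in at most `steps` tokens."""
    if state in dfa.accepting:
        return True
    if steps == 0:
        return False
    return any(can_accept(dfa, dfa.step(state, t), vocab, steps - 1) for t in vocab)


def allowed_tokens(dfa, state, vocab, remaining):
    """Tokens that keep the constraint satisfiable when `remaining` tokens are left,
    counting the token being chosen now."""
    return [t for t in vocab
            if can_accept(dfa, dfa.step(state, t), vocab, remaining - 1)]


vocab = ["the", "winter", "is", "cold"]
dfa = must_contain("winter")
print(dfa.accepts(["the", "winter", "is", "cold"]))  # True
print(allowed_tokens(dfa, 0, vocab, 1))              # ['winter']
```

With one token left and the keyword still missing, only `winter` is allowed. With a larger budget every token is allowed, because the keyword can still be placed later. This hard reachability mask is a deliberate simplification, standing in for the guidance that the abstract attributes to the Hidden Markov Model.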
