CL, Feb 28, 2025

Test-Time Alignment for Large Language Models via Textual Model Predictive Control

Kuang-Da Wang, Teng-Ruei Chen, Yu Heng Hung, Guo-Xun Ko, Shuoyang Ding, Yueh-Hua Wu, Yu-Chiang Frank Wang, Chao-Han Huck Yang, Wen-Chih Peng, Ping-Chun Hsieh

arXiv:2502.20795v36.72 citations, h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the resource-intensive nature of finetuning for LLM alignment, offering a lightweight alternative that could benefit developers and researchers in NLP, though it appears incremental as it builds on existing control theory and hierarchical reinforcement learning concepts.

The paper tackles test-time alignment of Large Language Models with human preferences by proposing Textual Model Predictive Control (TMPC), a predictive planning framework designed to mitigate both the curse of horizon and the curse of dimensionality. The authors report consistent performance improvements on discourse-level translation, long-form response generation, and program synthesis.

Aligning Large Language Models (LLMs) with human preferences through finetuning is resource-intensive, motivating lightweight alternatives at test time. We address test-time alignment through the lens of sequential decision making, a perspective that reveals two fundamental challenges. When actions are defined at the token level, as in guided decoding, alignment suffers from the curse of horizon. Conversely, when actions are defined at the response level, as in traditional iterative refinement, the curse of dimensionality emerges. To resolve this trade-off, we draw inspiration from Model Predictive Control (MPC) in control theory and propose Textual Model Predictive Control (TMPC), a novel predictive planning framework adapted for aligning LLMs at inference time. A key limitation of standard MPC is its reliance on predefined, hard segment boundaries, which are often absent in text generation. TMPC overcomes this by introducing two principles inspired by hierarchical reinforcement learning: (1) Hindsight Subgoal Identification, in which TMPC analyzes its generations to retrospectively identify high-reward intermediate outputs as subgoals, allowing the framework to discover meaningful, task-specific planning steps (e.g., a sentence in machine translation or a bug fix in code generation); and (2) Subgoal-Conditioned Re-Generation, in which these identified subgoals guide subsequent planning iterations. By conditioning on these proven, high-quality subgoals, TMPC ensures stable improvement by building upon previously validated successes. TMPC is evaluated on three tasks with distinct segmentation properties: discourse-level translation, long-form response generation, and program synthesis. The results demonstrate that TMPC consistently improves performance, highlighting the framework's generality.
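The two principles named in the abstract can be illustrated with a toy loop. The Python sketch below is not the authors' implementation. The names `tmpc_sketch`, `toy_generate`, and `toy_reward` are hypothetical, as are the `threshold` parameter and the fixed `DRAFTS`. It also assumes that segments are indexed by position, which is a simplification: the abstract stresses that segment boundaries are often absent and that subgoals are discovered in hindsight.

```python
from typing import Callable, Dict, List, Tuple

Segment = str
# Hypothetical interfaces (assumptions, not from the paper):
#   generate(iteration, subgoals) -> one candidate response as a list of segments
#   reward(position, segment)     -> scalar quality score for a single segment
Generator = Callable[[int, Dict[int, Segment]], List[Segment]]
Reward = Callable[[int, Segment], float]


def tmpc_sketch(
    generate: Generator,
    reward: Reward,
    iterations: int = 3,
    threshold: float = 0.8,
) -> Tuple[List[Segment], Dict[int, Segment]]:
    """Toy planning loop: score segments in hindsight, reuse the good ones."""
    subgoals: Dict[int, Segment] = {}  # position -> validated high-reward segment
    best: Dict[int, float] = {}        # position -> reward of the stored subgoal
    response: List[Segment] = []
    for it in range(iterations):
        # Subgoal-conditioned re-generation: the generator sees validated subgoals.
        candidate = generate(it, subgoals)
        # Hindsight subgoal identification: keep segments that scored well.
        for pos, seg in enumerate(candidate):
            r = reward(pos, seg)
            if r >= threshold and r > best.get(pos, float("-inf")):
                subgoals[pos] = seg
                best[pos] = r
        # Current response: validated subgoals where available, else the new draft.
        response = [subgoals.get(pos, seg) for pos, seg in enumerate(candidate)]
    return response, subgoals


# Deterministic stand-ins for an LLM and a reward model (illustration only).
DRAFTS = [
    ["good A", "bad B", "bad C"],
    ["bad A", "good B", "bad C"],
    ["bad A", "bad B", "good C"],
]


def toy_generate(iteration: int, subgoals: Dict[int, Segment]) -> List[Segment]:
    draft = DRAFTS[iteration % len(DRAFTS)]
    # Conditioning is mimicked by copying validated segments into the draft.
    return [subgoals.get(pos, seg) for pos, seg in enumerate(draft)]


def toy_reward(position: int, segment: Segment) -> float:
    return 1.0 if segment.startswith("good") else 0.0


if __name__ == "__main__":
    final_response, found = tmpc_sketch(toy_generate, toy_reward, iterations=3)
    print(final_response)
```

In this toy setting each iteration contributes one validated segment, so the response improves monotonically across iterations. A real system would replace the stand-ins with an LLM and a learned or task-specific reward, and would need some way to find segment boundaries instead of assuming positions.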
