CL | AI | LG | Mar 3, 2025

$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

arXiv:2503.01804v2 | h-index: 14
Originality: Highly original
AI Analysis

The paper addresses the problem of reliable LLM deployment for tasks that require strict correctness, such as grammar synthesis and planning, though it appears incremental in that it builds on existing constraint-based methods.

The paper tackles the challenge of ensuring syntactic and semantic correctness in LLM outputs by introducing SEM-CTRL, a unified approach that enforces constraints directly on the decoder. The authors report that this allows small pre-trained LLMs to outperform larger variants and state-of-the-art reasoning models while guaranteeing solution correctness.

Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints and task- and instance-specific semantics directly on an LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS), which is guided by specific syntactic and semantic constraints. The constraints over the desired outputs are expressed using Answer Set Grammars -- a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach guarantees correct completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, and planning. Our results demonstrate that $\texttt{SEM-CTRL}$ allows small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., o1-preview) while simultaneously guaranteeing solution correctness.

