Hard Constraints Meet Soft Generation: Guaranteed Feasibility for LLM-based Combinatorial Optimization
This work addresses solution feasibility, a critical requirement for real-world deployment of LLMs in combinatorial optimization, and proposes a novel method for this known bottleneck.
The paper tackles the problem of ensuring solution feasibility in LLM-based combinatorial optimization and introduces FALCON, a framework that achieves 100% feasibility while matching or exceeding the solution quality of state-of-the-art neural and LLM-based solvers across seven NP-hard problems.
Large language models (LLMs) have emerged as promising general-purpose solvers for combinatorial optimization (CO), yet they fundamentally lack mechanisms to guarantee solution feasibility, which is critical for real-world deployment. In this work, we introduce FALCON, a framework that ensures 100\% feasibility through three key innovations: (i) \emph{grammar-constrained decoding}, which enforces syntactic validity; (ii) a \emph{feasibility repair layer}, which corrects semantic constraint violations; and (iii) \emph{adaptive Best-of-$N$ sampling}, which allocates inference compute efficiently. To train the underlying LLM, we introduce Best-anchored Objective-guided Preference Optimization (BOPO), which weights preference pairs by their objective gap, providing dense supervision without human labels. Theoretically, we prove convergence for BOPO and provide bounds on repair-induced quality loss. Empirically, across seven NP-hard CO problems, FALCON achieves perfect feasibility while matching or exceeding the solution quality of state-of-the-art neural and LLM-based solvers.
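To make the interplay between the repair layer and Best-of-$N$ sampling concrete, the following is a minimal sketch on a toy 0/1 knapsack instance. It is not the FALCON implementation: the function names `repair_knapsack` and `best_of_n`, the greedy density-based repair rule, and the patience-based early stop used as a stand-in for adaptive sampling are all assumptions made for illustration. The `sample` callable is a hypothetical placeholder for the LLM; it is assumed to return a 0/1 list of the correct length, which is the kind of syntactic guarantee grammar-constrained decoding would supply.

```python
from typing import Callable, List, Sequence, Tuple


def repair_knapsack(selection: Sequence[int], weights: Sequence[float],
                    values: Sequence[float], capacity: float) -> List[int]:
    """Illustrative repair: drop the lowest value-per-weight items until the
    total weight fits the capacity. Assumes all weights are positive."""
    repaired = list(selection)
    chosen = [i for i, s in enumerate(repaired) if s]
    total = sum(weights[i] for i in chosen)
    # Visit selected items from worst to best value density.
    chosen.sort(key=lambda i: values[i] / weights[i])
    for i in chosen:
        if total <= capacity:
            break
        repaired[i] = 0
        total -= weights[i]
    return repaired


def best_of_n(sample: Callable[[], Sequence[int]], n: int,
              weights: Sequence[float], values: Sequence[float],
              capacity: float, patience: int = 4) -> Tuple[List[int], float]:
    """Draw up to n candidates (n >= 1), repair each one, and keep the best.
    Stops early after `patience` consecutive non-improving samples; this is a
    hypothetical stand-in for an adaptive compute-allocation rule."""
    best: List[int] = []
    best_value = float("-inf")
    stale = 0
    for _ in range(n):
        candidate = repair_knapsack(sample(), weights, values, capacity)
        value = sum(v for v, s in zip(values, candidate) if s)
        if value > best_value:
            best, best_value, stale = candidate, value, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, best_value
```

Because every candidate passes through the repair step before it is scored, the returned selection satisfies the capacity constraint regardless of what the sampler emits. The price is a possible loss in objective value, which is the quantity the abstract's repair-induced quality-loss bounds concern.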