72.8 · LG · May 24
Blocked Gibbs meets Diffusion Transformers: Unsupervised Learning for Constraint Optimization · Yudong W. Xu, Wenhao Li, Xiaoyu Wang et al.
Diffusion models have shown promise in learning to solve constraint optimization problems. However, they are mostly restricted to problems with binary variables and rely on graph neural networks, hindering their application to a broader range of problems such as those with general discrete variables or constraint structures that necessitate global rather than local reasoning. We investigate the use of Diffusion Transformers to address the aforementioned limitations. A naive implementation performs poorly due to a fundamental mismatch between the standard diffusion process and constraint solving: while the former applies small, incremental denoising across all variables, the latter requires substantially altering specific subsets of variables to attain feasibility or optimality. Our method, Blocked Gibbs Diffusion Transformer (BloGDiT), is the first to address this limitation by replacing standard joint Gaussian denoising with blocked Gaussian denoising. BloGDiT uses iterative block resampling and anneals the block size over time to facilitate large, targeted edits within a block of variables. Across Sudoku, Graph Coloring, Maximum Independent Set, and MaxCut, BloGDiT matches or outperforms existing methods, demonstrating that blocked Gibbs-style diffusion provides a highly effective inductive bias for Transformer-based constraint satisfaction and optimization.
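The block-resampling loop with an annealed block size can be illustrated on Graph Coloring. The following is a minimal sketch and not the BloGDiT implementation: the learned Diffusion Transformer is replaced by a hand-written conflict-based conditional, a sequential sweep inside the block stands in for a joint block sample, and the linear schedule, the `temperature` parameter, and all function names are assumptions made for illustration.

```python
import math
import random


def count_conflicts(edges, colors):
    """Number of edges whose endpoints share a colour."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def block_size_schedule(step, num_steps, n, min_block=1):
    """Assumed schedule: shrink the block linearly from n down to min_block."""
    frac = 1.0 - step / max(1, num_steps - 1)
    return max(min_block, round(min_block + frac * (n - min_block)))


def blocked_resampling_coloring(n, edges, k, num_steps=200, temperature=0.5, seed=0):
    """Resample a random block of variables at each step, annealing its size."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    colors = [rng.randrange(k) for _ in range(n)]
    best, best_conflicts = list(colors), count_conflicts(edges, colors)

    for step in range(num_steps):
        size = block_size_schedule(step, num_steps, n)
        block = rng.sample(range(n), size)
        for v in block:
            # Stand-in for the learned model: a Boltzmann conditional over the
            # colours of v given the current colours of its neighbours.
            costs = [sum(1 for w in neighbors[v] if colors[w] == c) for c in range(k)]
            weights = [math.exp(-cost / temperature) for cost in costs]
            colors[v] = rng.choices(range(k), weights=weights)[0]

        conflicts = count_conflicts(edges, colors)
        if conflicts < best_conflicts:
            best, best_conflicts = list(colors), conflicts
        if best_conflicts == 0:
            break
    return best, best_conflicts


# Usage: a 6-cycle with 3 colours.
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
assignment, conflicts = blocked_resampling_coloring(6, cycle_edges, k=3)
```

Early steps resample almost every variable, which allows large edits; late steps touch only a few variables, which resembles fine-grained local repair.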
63.9 · LG · Mar 21
Large Neighborhood Search meets Iterative Neural Constraint Heuristics · Yudong W. Xu, Wenhao Li, Scott Sanner et al. · utoronto
Neural networks are being increasingly used as heuristics for constraint satisfaction. These neural methods are often recurrent, learning to iteratively refine candidate assignments. In this work, we make explicit the connection between such iterative neural heuristics and Large Neighborhood Search (LNS), and adapt an existing neural constraint satisfaction method, ConsFormer, into an LNS procedure. We decompose the resulting neural LNS into two standard components: the destroy and repair operators. On the destroy side, we instantiate several classical heuristics and introduce novel prediction-guided operators that exploit the model's internal scores to select neighborhoods. On the repair side, we utilize ConsFormer as a neural repair operator and compare the original sampling-based decoder to a greedy decoder that selects the most likely assignments. Through an empirical study on Sudoku, Graph Coloring, and MaxCut, we find that adapting the neural heuristic to an LNS procedure yields substantial gains over its vanilla settings and improves its competitiveness with classical and neural baselines. We further observe consistent design patterns across tasks: stochastic destroy operators outperform greedy ones, while greedy repair is more effective than sampling-based repair for finding a single high-quality feasible assignment. These findings highlight LNS as a useful lens and design framework for structuring and improving iterative neural approaches.
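The destroy/repair decomposition can be sketched on MaxCut. This is a classical, non-neural illustration under stated assumptions: the destroy operator picks a uniformly random subset (a stochastic destroy), and the repair operator is a hand-written greedy rule standing in for ConsFormer with a greedy decoder. The function names, the `fraction` parameter, and the accept-if-not-worse rule are hypothetical choices, not taken from the paper.

```python
import random


def cut_value(edges, side):
    """Number of edges crossing the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])


def random_destroy(n, fraction, rng):
    """Stochastic destroy: free a uniformly random subset of the variables."""
    size = max(1, int(fraction * n))
    return rng.sample(range(n), size)


def greedy_repair(neighbors, side, destroyed):
    """Greedy repair: reassign each freed vertex to the side that cuts more
    edges to its already-assigned neighbours (stand-in for a neural repair)."""
    repaired = list(side)
    free = set(destroyed)
    for v in destroyed:
        free.discard(v)
        counts = [0, 0]  # assigned neighbours currently on side 0 / side 1
        for w in neighbors[v]:
            if w not in free:
                counts[repaired[w]] += 1
        # Placing v on side 0 cuts its edges to neighbours on side 1.
        repaired[v] = 0 if counts[1] >= counts[0] else 1
    return repaired


def lns_maxcut(n, edges, iterations=100, fraction=0.3, seed=0):
    """Alternate destroy and repair, keeping candidates that are not worse."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    side = [rng.randrange(2) for _ in range(n)]
    best = cut_value(edges, side)
    for _ in range(iterations):
        destroyed = random_destroy(n, fraction, rng)
        candidate = greedy_repair(neighbors, side, destroyed)
        value = cut_value(edges, candidate)
        if value >= best:
            side, best = candidate, value
    return side, best


# Usage: an 8-cycle.
ring = [(i, (i + 1) % 8) for i in range(8)]
partition, value = lns_maxcut(8, ring)
```

Swapping `random_destroy` for a score-guided selection, or `greedy_repair` for a sampling-based one, changes only one operator, which is the modularity that the LNS view provides.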
LG · Feb 18, 2025
Self-Supervised Transformers as Iterative Solution Improvers for Constraint Satisfaction · Yudong W. Xu, Wenhao Li, Scott Sanner et al. · utoronto
We present a Transformer-based framework for Constraint Satisfaction Problems (CSPs). CSPs find use in many applications and thus accelerating their solution with machine learning is of wide interest. Most existing approaches rely on supervised learning from feasible solutions or reinforcement learning, paradigms that require either feasible solutions to these NP-Complete CSPs or large training budgets and a complex expert-designed reward signal. To address these challenges, we propose ConsFormer, a self-supervised framework that leverages a Transformer as a solution refiner. ConsFormer constructs a solution to a CSP iteratively in a process that mimics local search. Instead of using feasible solutions as labeled data, we devise differentiable approximations to the discrete constraints of a CSP to guide model training. Our model is trained to improve random assignments for a single step but is deployed iteratively at test time, circumventing the bottlenecks of supervised and reinforcement learning. Experiments on Sudoku, Graph Coloring, Nurse Rostering, and MAXCUT demonstrate that our method can tackle out-of-distribution CSPs simply through additional iterations.
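One common way to make a discrete constraint differentiable is to score a probabilistic assignment by its expected number of violations. The abstract does not specify which approximations ConsFormer uses, so the sketch below is an assumption: it relaxes the "adjacent vertices differ" constraint of Graph Coloring, and the names `softmax` and `expected_conflicts` are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one variable's colour logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_conflicts(edges, probs):
    """Expected number of monochromatic edges when each vertex draws its
    colour independently from probs[v]. The value is a polynomial in the
    probabilities, hence differentiable with respect to the logits."""
    return sum(
        sum(pu * pv for pu, pv in zip(probs[u], probs[v]))
        for u, v in edges
    )


# Usage: a triangle with 3 colours.
triangle = [(0, 1), (1, 2), (0, 2)]

# Uninformative logits: each edge conflicts with probability 1/3.
uniform = [softmax([0.0, 0.0, 0.0]) for _ in range(3)]
loss_uniform = expected_conflicts(triangle, uniform)

# A one-hot feasible colouring has zero expected conflicts.
one_hot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss_feasible = expected_conflicts(triangle, one_hot)
```

A loss of this kind needs no labelled feasible solutions: it is zero exactly when the probability mass sits on assignments that satisfy every edge constraint.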