LGOct 22, 2025

A Markov Decision Process for Variable Selection in Branch & Bound

Paul Strang, Zacharie Alès, Côme Bissuel, Olivier Juan, Safia Kedad-Sidhoum, Emmanuel Rachelson

arXiv:2510.19348v17.11 citationsh-index: 19

Originality Incremental advance

AI Analysis

This work addresses a key bottleneck in MILP solvers for combinatorial optimization, offering an incremental improvement over existing RL-based methods.

The paper tackles the problem of variable selection in Branch and Bound (B&B) for Mixed-Integer Linear Programming by introducing BBMDP, a principled Markov Decision Process formulation, and shows that their branching agent outperforms prior state-of-the-art RL agents on four standard benchmarks.

Mixed-Integer Linear Programming (MILP) is a powerful framework used to address a wide range of NP-hard combinatorial optimization problems, often solved by Branch and Bound (B&B). A key factor influencing the performance of B&B solvers is the variable selection heuristic governing branching decisions. Recent contributions have sought to adapt reinforcement learning (RL) algorithms to the B&B setting to learn optimal branching policies, through Markov Decision Processes (MDP) inspired formulations, and ad hoc convergence theorems and algorithms. In this work, we introduce BBMDP, a principled vanilla MDP formulation for variable selection in B&B, allowing to leverage a broad range of RL algorithms for the purpose of learning optimal B\&B heuristics. Computational experiments validate our model empirically, as our branching agent outperforms prior state-of-the-art RL agents on four standard MILP benchmarks.

View on arXiv PDF

Similar