Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration
This addresses scalability bottlenecks in RL for combinatorial optimization, which is important for applications in logistics and natural sciences, though it appears incremental as it builds on existing RL and GNN methods.
The paper tackles the scalability issues of reinforcement learning approaches for combinatorial graph partitioning problems by introducing ECORD, which restricts expensive graph neural networks to a single pre-processing step. ECORD achieves a new state-of-the-art for RL algorithms on the Maximum Cut problem, reducing the optimality gap by up to 73% on 500-vertex graphs while improving speed and scalability.
From logistics to the natural sciences, combinatorial optimisation on graphs underpins numerous real-world applications. Reinforcement learning (RL) has shown particular promise in this setting as it can adapt to specific problem structures and does not require pre-solved instances for these, often NP-hard, problems. However, state-of-the-art (SOTA) approaches typically suffer from severe scalability issues, primarily due to their reliance on expensive graph neural networks (GNNs) at each decision step. We introduce ECORD; a novel RL algorithm that alleviates this expense by restricting the GNN to a single pre-processing step, before entering a fast-acting exploratory phase directed by a recurrent unit. Experimentally, ECORD achieves a new SOTA for RL algorithms on the Maximum Cut problem, whilst also providing orders of magnitude improvement in speed and scalability. Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500 vertex graphs with a decreased wall-clock time. Moreover, ECORD retains strong performance when generalising to larger graphs with up to 10000 vertices.