LG AI Nov 27, 2023

A Graph Neural Network-Based QUBO-Formulated Hamiltonian-Inspired Loss Function for Combinatorial Optimization using Reinforcement Learning

Redwan Ahmed Rizvee, Raheeb Hassan, Md. Mosaddek Khan

arXiv:2311.16277v12.01 citations h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses scalability and constraint satisfaction issues in combinatorial optimization for researchers and practitioners, though it is incremental, building on existing PI-GNN and RL methods.

The paper tackles the performance gap of PI-GNN, a Graph Neural Network framework for combinatorial optimization, by integrating Reinforcement Learning with a QUBO-formulated Hamiltonian as a generic reward function and by introducing a Monte Carlo Tree Search strategy, achieving up to 44% improvement in the number of constraint violations compared to PI-GNN.

Quadratic Unconstrained Binary Optimization (QUBO) is a generic technique for modeling various NP-hard combinatorial optimization (CO) problems in terms of binary variables. The Ising Hamiltonian models the energy function of a system, and mapping a QUBO to an Ising Hamiltonian is regarded as a way to solve various canonical optimization problems with quantum optimization algorithms. Recently, PI-GNN, a generic framework, was proposed to address CO problems over graphs using a Graph Neural Network (GNN) architecture. It introduced a generic QUBO-formulated Hamiltonian-inspired loss function that is optimized directly by the GNN. PI-GNN is highly scalable, but it satisfies noticeably fewer constraints than problem-specific algorithms, and this gap becomes more pronounced as graph density increases. Here, we identify a behavioral pattern related to this gap and devise strategies to improve its performance. Another body of literature uses Reinforcement Learning (RL) to solve these NP-hard problems with problem-specific reward functions. In this work, we also build a bridge between the RL-based solutions and the QUBO-formulated Hamiltonian: we formulate and empirically evaluate the compatibility of the QUBO-formulated Hamiltonian as a generic reward function in the RL-based paradigm. Furthermore, we introduce a novel Monte Carlo Tree Search-based strategy with GNN, in which we apply a guided search through manual perturbation of node labels during training. We empirically evaluated our methods and observed up to 44% improvement in the number of constraint violations compared to PI-GNN.
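
To make the idea of a QUBO-formulated Hamiltonian serving as both a loss and a reward more concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes Maximum Independent Set as the example problem, a standard penalty-style QUBO (reward selected nodes, penalize selected adjacent pairs), and a reward defined simply as the negative energy. The function names, the penalty value, and the choice of problem are all illustrative assumptions.

```python
import numpy as np


def build_mis_qubo(num_nodes, edges, penalty=2.0):
    """Build an upper-triangular QUBO matrix Q for Maximum Independent Set.

    Assumed formulation: H(x) = -sum_i x_i + penalty * sum_{(i,j) in E} x_i x_j,
    so that H(x) = x^T Q x for binary x (since x_i^2 = x_i).
    """
    Q = np.zeros((num_nodes, num_nodes))
    np.fill_diagonal(Q, -1.0)  # each selected node lowers the energy by 1
    for i, j in edges:
        a, b = min(i, j), max(i, j)
        Q[a, b] += penalty  # selecting both endpoints of an edge is penalized
    return Q


def qubo_energy(Q, x):
    """Evaluate x^T Q x; x may be binary labels or relaxed probabilities."""
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)


def count_violations(edges, x):
    """Count edges whose two endpoints are both selected."""
    return sum(1 for i, j in edges if x[i] == 1 and x[j] == 1)


def reward(Q, x):
    """Hypothetical RL reward: lower energy means higher reward."""
    return -qubo_energy(Q, x)


# Path graph 0 - 1 - 2
edges = [(0, 1), (1, 2)]
Q = build_mis_qubo(3, edges)

feasible = [1, 0, 1]    # independent set of size 2
infeasible = [1, 1, 0]  # nodes 0 and 1 are adjacent

print(qubo_energy(Q, feasible), count_violations(edges, feasible))
print(qubo_energy(Q, infeasible), count_violations(edges, infeasible))
```

Because the same expression x^T Q x accepts relaxed probabilities as well as binary labels, it can act as a differentiable loss for a GNN's soft node outputs or, negated, as a scalar reward for an RL agent's discrete assignment, without any problem-specific reward design beyond the matrix Q.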
