Enhancing Column Generation by Reinforcement Learning-Based Hyper-Heuristic for Vehicle Routing and Scheduling Problems
This work addresses the inefficiency of existing heuristics in column generation for combinatorial optimization problems like vehicle routing and bus driver scheduling, offering a novel method that improves solution quality and convergence.
The paper tackles the challenge of accelerating column generation for large-scale vehicle routing and scheduling problems by proposing a reinforcement learning-based hyper-heuristic framework (RLHH) that selects low-level heuristics to construct reduced networks, resulting in cost reductions of up to 27.9% for VRPTW and 15.4% for BDSP compared to existing heuristics.
Column generation (CG) is a vital method to solve large-scale problems by dynamically generating variables. It has extensive applications in common combinatorial optimization, such as vehicle routing and scheduling problems, where each iteration step requires solving an NP-hard constrained shortest path problem. Although some heuristic methods for acceleration already exist, they are not versatile enough to solve different problems. In this work, we propose a reinforcement learning-based hyper-heuristic framework, dubbed RLHH, to enhance the performance of CG. RLHH is a selection module embedded in CG to accelerate convergence and get better integer solutions. In each CG iteration, the RL agent selects a low-level heuristic to construct a reduced network only containing the edges with a greater chance of being part of the optimal solution. In addition, we specify RLHH to solve two typical combinatorial optimization problems: Vehicle Routing Problem with Time Windows (VRPTW) and Bus Driver Scheduling Problem (BDSP). The total cost can be reduced by up to 27.9\% in VRPTW and 15.4\% in BDSP compared to the best lower-level heuristic in our tested scenarios, within equivalent or even less computational time. The proposed RLHH is the first RL-based CG method that outperforms traditional approaches in terms of solution quality, which can promote the application of CG in combinatorial optimization.