Landscape-Aware Bandit Hyper-Heuristics for Online Operator Selection in UAV Inspection Routing
For researchers and practitioners in UAV routing and combinatorial optimization, this work provides an adaptive operator-selection method that improves final solution quality over the UCB-HH and Random-HH hyper-heuristic baselines on generated Euclidean TSP instances, though the improvement is incremental.
This paper introduces LA-BHH, a landscape-aware bandit hyper-heuristic for online operator selection in UAV inspection routing, which uses LinUCB to learn to select among 2-opt, swap, relocate, and Or-opt moves. On 45 generated Euclidean TSP instances, LA-BHH achieves a mean final gap of 0.0223 and a convergence AUC of 0.0389, reducing the final gap by 17.6% relative to UCB-HH and by 22.6% relative to Random-HH.
UAV multi-site inspection often reduces to choosing a high-quality visiting order after target sites have been extracted from a map. This paper develops LA-BHH, a landscape-aware bandit hyper-heuristic that learns an operator-selection policy online for this routing layer. LA-BHH treats the 2-opt, swap, relocate, and Or-opt moves as low-level arms, builds its context from static landscape descriptors and online search-state features, and updates a LinUCB controller from improvement rewards within the same run. Experimental results on 45 generated Euclidean TSP instances show that LA-BHH achieves the best mean final gap and the best convergence AUC (0.0223 and 0.0389, respectively). It reduces the final gap by 17.6\% relative to UCB-HH, 22.6\% relative to Random-HH, and 68.2\% relative to nearest-neighbor construction. Ablation results further show that contextual credit assignment, 2-opt repair, and stagnation-aware state features are the main contributors.
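To make the control loop described above concrete, the following is a minimal Python sketch of a disjoint LinUCB controller choosing among the four move operators on a random Euclidean instance. It is an illustration under stated assumptions, not the paper's implementation: the four context features, the exploration weight `alpha`, the random sampling of move positions, the improve-only acceptance rule, and all function and class names (`LinUCB`, `apply_move`, `run_sketch`) are hypothetical choices made for this example.

```python
import numpy as np

OPERATORS = ("two_opt", "swap", "relocate", "or_opt")  # low-level arms


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    p = pts[tour]
    return float(np.linalg.norm(p - np.roll(p, -1, axis=0), axis=1).sum())


def apply_move(tour, op, rng):
    """Apply one randomly positioned move of type `op`; returns a new tour."""
    n = len(tour)
    t = [int(c) for c in tour]
    i, j = sorted(int(k) for k in rng.choice(n, size=2, replace=False))
    if op == "two_opt":        # reverse the segment between i and j
        t[i:j + 1] = t[i:j + 1][::-1]
    elif op == "swap":         # exchange two sites
        t[i], t[j] = t[j], t[i]
    elif op == "relocate":     # move one site to another position
        city = t.pop(i)
        t.insert(j, city)
    else:                      # or_opt: move a short segment (at most 3 sites)
        seg_len = min(3, n - i)
        seg = t[i:i + seg_len]
        del t[i:i + seg_len]
        pos = int(rng.integers(0, len(t) + 1))
        t[pos:pos] = seg
    return np.array(t)


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = np.stack([np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            # estimated reward plus an upper-confidence exploration bonus
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


def run_sketch(n_sites=30, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    pts = rng.random((n_sites, 2))          # random Euclidean instance
    tour = rng.permutation(n_sites)
    cur = start = tour_length(tour, pts)
    bandit = LinUCB(len(OPERATORS), dim=4, alpha=0.5)
    stagnation = 0
    for it in range(iters):
        # Assumed context: bias, search progress, stagnation, relative length.
        x = np.array([1.0, it / iters, min(stagnation, 100) / 100.0, cur / start])
        arm = bandit.select(x)
        cand = apply_move(tour, OPERATORS[arm], rng)
        new = tour_length(cand, pts)
        reward = max(0.0, (cur - new) / cur)  # improvement reward, never negative
        bandit.update(arm, x, reward)
        if new < cur:                          # accept improving moves only
            tour, cur, stagnation = cand, new, 0
        else:
            stagnation += 1
    return tour, start, cur
```

Calling `run_sketch()` returns the final tour together with the initial and final tour lengths. The sketch omits static landscape descriptors, which would be appended to the context vector `x` in the same way as the search-state features shown here.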