On Learning Paradigms for the Travelling Salesman Problem
This work addresses the challenge of scaling neural solvers for combinatorial optimization problems such as the TSP, though its contribution appears incremental because it compares established paradigms.
The paper investigates how the choice of learning paradigm affects deep neural networks trained to solve the Travelling Salesman Problem (TSP). It finds that reinforcement learning, which needs no labelled data, generalizes better than supervised learning to variable graph sizes of up to 500 nodes.
We explore the impact of learning paradigms on training deep neural networks for the Travelling Salesman Problem. We design controlled experiments to train supervised learning (SL) and reinforcement learning (RL) models on fixed graph sizes up to 100 nodes, and evaluate them on variable-sized graphs up to 500 nodes. Beyond not needing labelled data, our results reveal favorable properties of RL over SL: RL training leads to better emergent generalization to variable graph sizes and is a key component for learning scale-invariant solvers for novel combinatorial problems.
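To make the contrast between the two paradigms concrete, the sketch below shows how the training signals might differ. The abstract does not specify the loss functions, so this assumes a common setup: SL minimizes the negative log-likelihood of a labelled optimal tour, while RL uses a REINFORCE-style surrogate driven only by the length of a sampled tour. The function names (`tour_length`, `sl_loss`, `rl_loss`) and the baseline term are hypothetical, chosen for illustration.

```python
import math

def tour_length(coords, tour):
    """Total Euclidean length of a closed tour over 2D points."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )

def sl_loss(step_log_probs, optimal_tour):
    """Assumed SL signal: negative log-likelihood of a labelled tour.

    step_log_probs[t][j] is the model's log-probability of visiting
    node j at decoding step t. Requires an optimal tour as the label.
    """
    return -sum(step_log_probs[t][node] for t, node in enumerate(optimal_tour))

def rl_loss(sampled_log_prob, sampled_length, baseline_length):
    """Assumed RL signal: REINFORCE surrogate with a baseline.

    Needs only the length of the model's own sampled tour, no label.
    Minimizing it raises the probability of tours shorter than the baseline.
    """
    return (sampled_length - baseline_length) * sampled_log_prob

# Four cities on the unit square (illustrative data, not from the paper).
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
optimal = [0, 1, 2, 3]   # perimeter tour
sampled = [0, 2, 1, 3]   # tour that crosses both diagonals

# A model that is uniform over the 4 nodes at every step.
uniform = [[math.log(0.25)] * 4 for _ in range(4)]

print(tour_length(coords, optimal))
print(tour_length(coords, sampled))
print(sl_loss(uniform, optimal))
print(rl_loss(-2.0, tour_length(coords, sampled), tour_length(coords, optimal)))
```

The point of the sketch is the difference in inputs: `sl_loss` cannot be computed without `optimal`, which must come from an exact or near-exact solver, whereas `rl_loss` depends only on quantities the model can produce itself.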