LG · Oct 28, 2021

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo, Haifeng Zhang, Jun Wang

arXiv:2110.15105v39.920 citations

Originality: Incremental advance

AI Analysis

This paper addresses overfitting in TSP solvers for combinatorial optimization, though it is an incremental application of game theory to a known bottleneck.

The paper tackles the problem of improving generalization in deep learning-based Traveling Salesman Problem (TSP) solvers by introducing a two-player zero-sum game between a Solver and a Data Generator, resulting in state-of-the-art performance on unseen TSP tasks where other solvers overfit.

In this paper, we introduce a two-player zero-sum framework between a trainable Solver and a Data Generator to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP). Grounded in Policy Space Response Oracle (PSRO) methods, our two-player framework outputs a population of best-responding Solvers, over which we can mix and output a combined model that achieves the least exploitability against the Generator, and thereby the most generalizable performance on different TSP tasks. We conduct experiments on a variety of TSP instances with different types and sizes. Results suggest that our Solvers achieve state-of-the-art performance even on tasks the Solver has never seen, whilst the performance of other deep learning-based Solvers drops sharply due to over-fitting. To demonstrate the principle of our framework, we study the learning outcome of the proposed two-player game and show that the exploitability of the Solver population decreases during training, and that it eventually approximates the Nash equilibrium along with the Generator.
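The idea of mixing over a Solver population to minimize exploitability can be illustrated with a small matrix game. The sketch below is not the authors' implementation. It assumes a hypothetical table `gap[i][j]` holding the optimality gap of Solver `i` on instances from Generator distribution `j`, approximates the equilibrium mix with plain fictitious play (a stand-in for whatever meta-solver the paper uses), and measures exploitability only within this restricted matrix rather than against freshly trained best responses. All names and numbers are illustrative.

```python
def fictitious_play(gap, iters=5000):
    """Approximate the equilibrium mixes of a zero-sum matrix game.

    gap[i][j] is an assumed optimality gap of Solver i on data from
    Generator j. The Solver (rows) minimizes it, the Generator (columns)
    maximizes it.
    """
    n_rows, n_cols = len(gap), len(gap[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = 1  # arbitrary initial plays
    col_counts[0] = 1
    for _ in range(iters):
        # Expected gap of each Solver against the Generator's empirical mix.
        row_vals = [sum(gap[i][j] * col_counts[j] for j in range(n_cols))
                    for i in range(n_rows)]
        # Expected gap on each distribution against the Solver's empirical mix.
        col_vals = [sum(gap[i][j] * row_counts[i] for i in range(n_rows))
                    for j in range(n_cols)]
        # Both players best-respond simultaneously.
        row_counts[row_vals.index(min(row_vals))] += 1
        col_counts[col_vals.index(max(col_vals))] += 1
    p = [c / sum(row_counts) for c in row_counts]
    q = [c / sum(col_counts) for c in col_counts]
    return p, q


def exploitability(gap, p, q):
    """Sum of both players' gains from deviating to a pure best response."""
    n_rows, n_cols = len(gap), len(gap[0])
    # Worst gap the Generator can force on the Solver mix p.
    worst = max(sum(p[i] * gap[i][j] for i in range(n_rows))
                for j in range(n_cols))
    # Best gap a single Solver can reach against the Generator mix q.
    best = min(sum(gap[i][j] * q[j] for j in range(n_cols))
               for i in range(n_rows))
    return worst - best


# Toy numbers: each Solver overfits to one distribution.
gap = [[0.01, 0.10],
       [0.08, 0.02]]
p, q = fictitious_play(gap)
mixed_exp = exploitability(gap, p, q)
pure_exp = exploitability(gap, [1.0, 0.0], [1.0, 0.0])
print("Solver mix:", p)
print("Exploitability (mixed vs. single Solver):", mixed_exp, pure_exp)
```

For this toy matrix the exact equilibrium puts weight 0.4 on the first Solver and 0.6 on the second, and fictitious play should approach that mix. The point of the comparison is that a single overfitted Solver is far more exploitable than the mixed population, which mirrors the abstract's claim at a very small scale.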
