Distilling Privileged Information for Dubins Traveling Salesman Problems with Neighborhoods
This work addresses efficient tour planning for vehicles with motion constraints, such as drones or robots, in applications like surveillance or delivery. The contribution is incremental, since it builds on existing heuristics and learning techniques.
The paper tackles the Dubins Traveling Salesman Problem with Neighborhoods (DTSPN) with a learning method that combines privileged information distillation and supervised adaptation to generate tours for non-holonomic vehicles. It produces solutions about 50 times faster than the Lin-Kernighan heuristic (LKH) algorithm and outperforms other imitation learning and reinforcement learning (RL) approaches.
This paper presents a novel learning approach for Dubins Traveling Salesman Problems (DTSP) with Neighborhoods (DTSPN) that quickly produces a tour for a non-holonomic vehicle passing through the neighborhoods of given task points. The method involves two learning phases. First, a model-free reinforcement learning approach leverages privileged information to distill knowledge from expert trajectories generated by the Lin-Kernighan heuristic (LKH) algorithm. Second, a supervised learning phase trains an adaptation network to solve problems independently of privileged information. Before the first learning phase, a parameter initialization technique using the demonstration data is applied to enhance training efficiency. The proposed learning method produces a solution about 50 times faster than LKH and substantially outperforms other imitation learning and RL-with-demonstration schemes, most of which fail to sense all the task points.
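To make the problem setting concrete, the following is a minimal sketch of the two ingredients the abstract refers to: a vehicle whose turning is bounded (the standard Dubins model) and a check of whether its path "senses" every task point. It is not the paper's implementation. The function names, the Euler integration, the circular neighborhood shape, and all numeric values (speed, minimum turning radius, sensing radius, time step) are assumptions chosen for illustration.

```python
import math

def dubins_step(state, turn_rate, speed=1.0, min_turn_radius=1.0, dt=0.05):
    """One Euler step of the Dubins model (illustrative parameters).

    The commanded turn rate is clipped to speed / min_turn_radius, which is
    what makes the vehicle non-holonomic: it cannot turn in place.
    """
    x, y, heading = state
    max_rate = speed / min_turn_radius
    u = max(-max_rate, min(max_rate, turn_rate))
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading + u * dt)

def rollout(state, turn_rates, **kwargs):
    """Apply a sequence of turn-rate commands and return every visited state."""
    path = [state]
    for u in turn_rates:
        state = dubins_step(state, u, **kwargs)
        path.append(state)
    return path

def sensed_tasks(path, task_points, sensing_radius):
    """For each task point, report whether any path sample lies inside its
    neighborhood (assumed here to be a disc of radius sensing_radius)."""
    return [any(math.hypot(x - tx, y - ty) <= sensing_radius for x, y, _ in path)
            for tx, ty in task_points]

# Hypothetical scenario: drive straight for 40 steps, then request a turn
# sharper than the vehicle allows (5.0 rad/s is clipped to 1.0 rad/s).
path = rollout((0.0, 0.0, 0.0), [0.0] * 40 + [5.0] * 31)
tasks = [(1.0, 0.3), (3.0, 1.0), (0.0, 3.0)]
sensed = sensed_tasks(path, tasks, sensing_radius=0.5)
tour_is_valid = all(sensed)  # a DTSPN tour must sense every task point
```

In this sketch a tour counts as valid only when every entry of `sensed` is true, which mirrors the failure mode the abstract attributes to the baseline methods. A sampled path can miss a neighborhood that the continuous curve would touch, so a smaller `dt` or a segment-to-disc distance test would be a more careful choice.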