Xuan Wu

h-index 27

4 papers

70 citations

Novelty 53%

AI Score 39

Ranked #80,412 of 194,257 authors (top 41%); #231 in NE (top 22%)

4 Papers

13.2 | AI | Aug 1, 2023 | Code

Reinforcement Learning-based Non-Autoregressive Solver for Traveling Salesman Problems

Yubin Xiao, Di Wang, Boyang Li et al.

The Traveling Salesman Problem (TSP) is a well-known combinatorial optimization problem with broad real-world applications. Recently, neural networks have gained popularity in this research area because, as shown in the literature, they provide strong heuristic solutions to TSPs. Compared to autoregressive neural approaches, non-autoregressive (NAR) networks exploit inference parallelism to increase inference speed but suffer from comparatively low solution quality. In this paper, we propose a novel NAR model named NAR4TSP, which incorporates a specially designed architecture and an enhanced reinforcement learning (RL) strategy. To the best of our knowledge, NAR4TSP is the first TSP solver that successfully combines RL and NAR networks. The key lies in incorporating the decoding of the NAR network output into the training process. NAR4TSP efficiently represents TSP encoded information as rewards and seamlessly integrates it into RL strategies, while maintaining consistent TSP sequence constraints during both training and testing phases. Experimental results on both synthetic and real-world TSPs demonstrate that NAR4TSP outperforms five state-of-the-art models in terms of solution quality, inference speed, and generalization to unseen scenarios.
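To make the idea of decoding a NAR output under the TSP sequence constraint concrete, here is a minimal sketch. It is not the NAR4TSP decoder: it assumes the network emits an n-by-n score matrix (the hypothetical `scores`), decodes it greedily while masking visited cities, and uses the negative tour length as an illustrative reward. The names `greedy_decode` and `tour_length` are made up for this example.

```python
import math

def greedy_decode(scores, start=0):
    """Greedily build a tour from an n-by-n score matrix (assumed NAR output).

    At each step, move to the unvisited city with the highest score from
    the current city, so every city is visited exactly once.
    """
    n = len(scores)
    tour = [start]
    visited = {start}
    while len(tour) < n:
        cur = tour[-1]
        nxt = max((j for j in range(n) if j not in visited),
                  key=lambda j: scores[cur][j])
        tour.append(nxt)
        visited.add(nxt)
    return tour

def tour_length(coords, tour):
    """Length of the closed tour, including the return to the start city."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

# Four cities on a unit square; the scores favor walking around the perimeter.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
scores = [
    [0.0, 0.9, 0.1, 0.5],
    [0.9, 0.0, 0.8, 0.1],
    [0.1, 0.8, 0.0, 0.7],
    [0.5, 0.1, 0.7, 0.0],
]
tour = greedy_decode(scores)
reward = -tour_length(coords, tour)  # illustrative reward: shorter is better
```

Applying the same decoding routine during training and testing is one simple way to keep the sequence constraint consistent across both phases, which is the property the abstract emphasizes.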

9.4 | LG | Jun 2, 2025

Towards Efficient Few-shot Graph Neural Architecture Search via Partitioning Gradient Contribution

Wenhao Song, Xuan Wu, Bo Yang et al.

To address the weight coupling problem, certain studies introduced few-shot Neural Architecture Search (NAS) methods, which partition the supernet into multiple sub-supernets. However, these methods often suffer from computational inefficiency and tend to provide suboptimal partitioning schemes. To address this problem more effectively, we analyze the weight coupling problem from a novel perspective: it primarily stems from distinct modules in succeeding layers imposing conflicting gradient directions on the preceding layer modules. Based on this perspective, we propose the Gradient Contribution (GC) method, which efficiently computes the cosine similarity of gradient directions among modules by decomposing the Vector-Jacobian Product during supernet backpropagation. Subsequently, modules with conflicting gradient directions are allocated to distinct sub-supernets, while similar ones are grouped together. To assess the advantages of GC and address the limitations of existing Graph Neural Architecture Search methods, which are limited to searching a single type of Graph Neural Network (GNN), either Message Passing Neural Networks (MPNNs) or Graph Transformers (GTs), we propose the Unified Graph Neural Architecture Search (UGAS) framework, which explores optimal combinations of MPNNs and GTs. The experimental results demonstrate that GC achieves state-of-the-art (SOTA) performance in supernet partitioning quality and time efficiency. In addition, the architectures searched by UGAS+GC outperform both manually designed GNNs and those obtained by existing NAS methods. Finally, ablation studies further demonstrate the effectiveness of all proposed methods.
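The grouping step described in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the GC method: it takes already-computed, nonzero gradient vectors per module (the paper obtains them by decomposing the Vector-Jacobian Product), and it uses a simple greedy rule with a hypothetical `threshold` to keep modules with conflicting directions apart. The module names are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity of two nonzero gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def partition_by_gradient(grads, threshold=0.0):
    """Greedy grouping (an assumed rule, not the paper's scheme).

    A module joins the first group whose members all have cosine
    similarity above `threshold` with it; otherwise it starts a new group.
    """
    groups = []
    for name, g in grads.items():
        for group in groups:
            if all(cosine(g, grads[m]) > threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Hypothetical gradients that three candidate modules impose on a shared layer.
grads = {
    "gcn": [1.0, 0.0],
    "gat": [0.9, 0.1],
    "attn": [-1.0, 0.2],
}
groups = partition_by_gradient(grads)
```

Here `gcn` and `gat` point in nearly the same direction and land in one group, while `attn` conflicts with both and gets its own group, mirroring the allocation of conflicting modules to distinct sub-supernets.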

7.3 | NE | Aug 25, 2021 | Code

Incorporating Surprisingly Popular Algorithm and Euclidean Distance-based Adaptive Topology into PSO

Xuan Wu, Jizong Han, Di Wang et al.

While many Particle Swarm Optimization (PSO) algorithms use only fitness to assess the performance of particles, in this work we adopt the Surprisingly Popular Algorithm (SPA) as a complementary metric in addition to fitness. Consequently, particles that are not widely known also have the opportunity to be selected as learning exemplars. In addition, we propose a Euclidean distance-based adaptive topology to cooperate with SPA, in which each particle connects only to the k particles with the shortest Euclidean distance during each iteration. We also introduce the adaptive topology into heterogeneous populations to better solve large-scale problems. Specifically, the exploration sub-population better preserves the diversity of the population, while the exploitation sub-population achieves fast convergence. Therefore, large-scale problems can be solved in a collaborative manner to improve overall performance. To evaluate the performance of our method, we conduct extensive experiments on various optimization problems, including three benchmark suites and two real-world optimization problems. The results demonstrate that our Euclidean distance-based adaptive topology outperforms other widely adopted topologies and further suggest that our method performs significantly better than state-of-the-art PSO variants on small, medium, and large-scale problems.
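The neighborhood rule in this abstract, where each particle connects to its k nearest particles at every iteration, can be illustrated with a short sketch. It covers only the topology step; the SPA-based exemplar selection and the heterogeneous sub-populations are not modeled, and the function name `knn_topology` is hypothetical.

```python
import math

def knn_topology(positions, k):
    """For each particle, return the indices of its k nearest other particles.

    Distances are Euclidean; ties are broken by particle index. The result
    would be recomputed every iteration as the particles move.
    """
    neighbors = []
    for i, p in enumerate(positions):
        others = [j for j in range(len(positions)) if j != i]
        others.sort(key=lambda j: math.dist(p, positions[j]))
        neighbors.append(others[:k])
    return neighbors

# Two well-separated pairs of particles in a 2-D search space.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
topology = knn_topology(positions, k=1)
```

With k = 1, each particle links only to its partner in the same pair, so information spreads locally until the particles move closer together.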

3.0 | NE | Mar 12, 2021

Neural Architecture Search based on Cartesian Genetic Programming Coding Method

Xuan Wu, Linhan Jia, Xiuyi Zhang et al.

Neural architecture search (NAS) is a hot topic in the field of automated machine learning and outperforms humans in designing neural architectures on quite a few machine learning tasks. Motivated by the natural representation of neural networks in Cartesian genetic programming (CGP), we propose an evolutionary NAS approach based on CGP, called CGPNAS, to solve the sentence classification task. To evolve architectures under the CGP framework, operations such as convolution are defined as the function node types of CGP, and the evolutionary operations are designed based on the Evolutionary Strategy. The experimental results show that the searched architectures perform comparably to human-designed architectures. We also verify the domain transfer ability of our evolved architectures. The transfer experimental results show that the accuracy deterioration is lower than 2-5%. Finally, the ablation study identifies the Attention function as the single key function node and shows that linear transformations alone can keep the accuracy similar to that of the full evolved architectures, which is worthy of investigation in the future.
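As a rough illustration of how a CGP genotype can encode an architecture, the sketch below stores each function node as an operation name plus the indices of its inputs, then traces back from the output to find the nodes that actually contribute. The genome layout and the operation names are assumptions made for this example and are not taken from CGPNAS.

```python
def active_nodes(genome, n_inputs, output):
    """Return the sorted indices of function nodes reachable from `output`.

    genome[i] is (op_name, [input indices]) and has node index n_inputs + i.
    Indices below n_inputs refer to the network inputs themselves.
    """
    active = set()
    stack = [output]
    while stack:
        idx = stack.pop()
        if idx < n_inputs or idx in active:
            continue  # skip raw inputs and nodes already visited
        active.add(idx)
        stack.extend(genome[idx - n_inputs][1])
    return sorted(active)

# Hypothetical genome: one input (index 0) and four function nodes (1 to 4).
genome = [
    ("conv", [0]),       # node 1
    ("attention", [0]),  # node 2
    ("linear", [2]),     # node 3
    ("conv", [1]),       # node 4
]
used = active_nodes(genome, n_inputs=1, output=3)
architecture = [genome[i - 1][0] for i in used]
```

Only the active nodes form the decoded network; the inactive ones stay in the genome and can become active again after mutation, which is a standard property of CGP.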