LGMar 10, 2021Code
Graph Neural Networks Inspired by Classical Iterative AlgorithmsYongyi Yang, Tang Liu, Yangkun Wang et al.
Despite the recent success of graph neural networks (GNN), common architectures often exhibit significant limitations, including sensitivity to oversmoothing, long-range dependencies, and spurious edges, e.g., as can occur as a result of graph heterophily or adversarial attacks. To at least partially address these issues within a simple transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely, proximal gradient descent and iterative reweighted least squares (IRLS). The former defines an extensible base GNN architecture that is immune to oversmoothing while nonetheless capturing long-range dependencies by allowing arbitrary propagation steps. In contrast, the latter produces a novel attention mechanism that is explicitly anchored to an underlying end-to-end energy function, contributing stability with respect to edge uncertainty. When combined we obtain an extremely simple yet robust model that we evaluate across disparate scenarios including standardized benchmarks, adversarially-perturbated graphs, graphs with heterophily, and graphs involving long-range dependencies. In doing so, we compare against SOTA GNN approaches that have been explicitly designed for the respective task, achieving competitive or superior node classification accuracy. Our code is available at https://github.com/FFTYYY/TWIRLS.
LGOct 11, 2025
Enhancing the Cross-Size Generalization for Solving Vehicle Routing Problems via Continual LearningJingwen Li, Zhiguang Cao, Yaoxin Wu et al.
Exploring machine learning techniques for addressing vehicle routing problems has attracted considerable research attention. To achieve decent and efficient solutions, existing deep models for vehicle routing problems are typically trained and evaluated using instances of a single size. This substantially limits their ability to generalize across different problem sizes and thus hampers their practical applicability. To address the issue, we propose a continual learning based framework that sequentially trains a deep model with instances of ascending problem sizes. Specifically, on the one hand, we design an inter-task regularization scheme to retain the knowledge acquired from smaller problem sizes in the model training on a larger size. On the other hand, we introduce an intra-task regularization scheme to consolidate the model by imitating the latest desirable behaviors during training on each size. Additionally, we exploit the experience replay to revisit instances of formerly trained sizes for mitigating the catastrophic forgetting. Experimental results show that our approach achieves predominantly superior performance across various problem sizes (either seen or unseen in the training), as compared to state-of-the-art deep models including the ones specialized for generalizability enhancement. Meanwhile, the ablation studies on the key designs manifest their synergistic effect in the proposed framework.
LGNov 12, 2021
Implicit vs Unfolded Graph Neural NetworksYongyi Yang, Tang Liu, Yangkun Wang et al.
It has been observed that message-passing graph neural networks (GNN) sometimes struggle to maintain a healthy balance between the efficient/scalable modeling of long-range dependencies across nodes while avoiding unintended consequences such oversmoothed node representations, sensitivity to spurious edges, or inadequate model interpretability. To address these and other issues, two separate strategies have recently been proposed, namely implicit and unfolded GNNs (that we abbreviate to IGNN and UGNN respectively). The former treats node representations as the fixed points of a deep equilibrium model that can efficiently facilitate arbitrary implicit propagation across the graph with a fixed memory footprint. In contrast, the latter involves treating graph propagation as unfolded descent iterations as applied to some graph-regularized energy function. While motivated differently, in this paper we carefully quantify explicit situations where the solutions they produce are equivalent and others where their properties sharply diverge. This includes the analysis of convergence, representational capacity, and interpretability. In support of this analysis, we also provide empirical head-to-head comparisons across multiple synthetic and public real-world node classification benchmarks. These results indicate that while IGNN is substantially more memory-efficient, UGNN models support unique, integrated graph attention mechanisms and propagation rules that can achieve strong node classification accuracy across disparate regimes such as adversarially-perturbed graphs, graphs with heterophily, and graphs involving long-range dependencies.
LGJun 9, 2021
Scaling Up Graph Neural Networks Via Graph CoarseningZengfeng Huang, Shengzhong Zhang, Chong Xi et al.
Scalability of graph neural networks remains one of the major challenges in graph machine learning. Since the representation of a node is computed by recursively aggregating and transforming representation vectors of its neighboring nodes from previous layers, the receptive fields grow exponentially, which makes standard stochastic optimization techniques ineffective. Various approaches have been proposed to alleviate this issue, e.g., sampling-based methods and techniques based on pre-computation of graph filters. In this paper, we take a different approach and propose to use graph coarsening for scalable training of GNNs, which is generic, extremely simple and has sublinear memory and time costs during training. We present extensive theoretical analysis on the effect of using coarsening operations and provides useful guidance on the choice of coarsening methods. Interestingly, our theoretical analysis shows that coarsening can also be considered as a type of regularization and may improve the generalization. Finally, empirical results on real world datasets show that, simply applying off-the-shelf coarsening methods, we can reduce the number of nodes by up to a factor of ten without causing a noticeable downgrade in classification accuracy.