LG · Apr 1, 2025

Adversarial Curriculum Graph-Free Knowledge Distillation for Graph Neural Networks

Yuang Jia, Xiaojuan Shan, Jun Xia, Guancheng Wan, Yuchen Zhang, Wenke Huang, Mang Ye, Stan Z. Li

arXiv:2504.00540v2 · 4.1 · h-index: 19

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for graph machine learning practitioners, offering an incremental improvement over existing methods by adapting techniques from vision to graphs.

The paper tackles the problem of data-free knowledge distillation for graph neural networks, which is challenging due to graph data's varying topological structures, and proposes ACGKD, a method that reduces spatial complexity and accelerates distillation while achieving state-of-the-art performance.

Data-free Knowledge Distillation (DFKD) constructs pseudo-samples with a generator, without access to real data, and transfers knowledge from a teacher model to a student by training the student to overcome dimensional differences and mimic the teacher's outputs on these pseudo-samples. In recent years, studies in the vision domain have made notable advances in this area. However, the varying topological structures and non-grid nature of graph data render the vision-domain methods ineffective. Building on prior research into differentiable methods for graph neural networks, we propose a fast, high-quality data-free knowledge distillation approach. Without compromising distillation quality, the proposed graph-free KD method (ACGKD) significantly reduces the spatial complexity of pseudo-graphs by using the Binary Concrete distribution to model the graph structure and by introducing a spatial complexity tuning parameter. This enables efficient gradient computation for the graph structure, which accelerates the overall distillation process. Additionally, ACGKD eliminates the dimensional ambiguity between the student and teacher models by increasing the student's dimensions and reusing the teacher's classifier. Moreover, it equips graph knowledge distillation with a curriculum learning (CL) based strategy so that the student learns graph structures progressively. Extensive experiments demonstrate that ACGKD achieves state-of-the-art performance in distilling knowledge from GNNs without training data.
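To make the Binary Concrete idea in the abstract concrete, the following is a minimal NumPy sketch of sampling a relaxed (soft) adjacency matrix from learnable edge logits. It is an illustration under stated assumptions, not the paper's implementation: the function names, the symmetric zero-diagonal layout, and the temperature value are all hypothetical, and the paper's spatial complexity tuning parameter is not reproduced here. The standard Binary Concrete relaxation adds logistic noise to each logit and applies a tempered sigmoid, so each edge weight stays in (0, 1) and is differentiable with respect to its logit.

```python
import numpy as np


def sample_binary_concrete(logits, temperature, rng):
    """Draw one relaxed Bernoulli (Binary Concrete) sample per entry."""
    # Uniform noise, kept away from 0 and 1 so the logs stay finite.
    u = rng.uniform(low=1e-6, high=1.0 - 1e-6, size=logits.shape)
    # Logistic noise: log(u) - log(1 - u).
    logistic_noise = np.log(u) - np.log1p(-u)
    # Tempered sigmoid; lower temperature pushes samples toward 0 or 1.
    return 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))


def sample_soft_adjacency(edge_logits, temperature=0.5, seed=0):
    """Hypothetical helper: soft adjacency for an undirected pseudo-graph."""
    rng = np.random.default_rng(seed)
    soft = sample_binary_concrete(edge_logits, temperature, rng)
    # Keep the strict upper triangle and mirror it, which gives a
    # symmetric matrix with no self-loops (an assumption of this sketch).
    upper = np.triu(soft, k=1)
    return upper + upper.T


num_nodes = 5
edge_logits = np.zeros((num_nodes, num_nodes))  # logit 0 means edge prior 0.5
adj = sample_soft_adjacency(edge_logits)
```

In a real distillation loop the logits would be trainable parameters in an autodiff framework, so that gradients from the teacher and student losses can flow back into the graph structure; NumPy is used here only to show the sampling step.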

