LG | AI | Mar 2, 2024

Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework

arXiv:2403.01079v1 | 1 citation | h-index: 9
AI Analysis

This work addresses efficiency issues for practitioners working with large graph datasets, though it appears incremental because it builds on existing knowledge distillation approaches.

The paper tackles the problem of high computational costs in Graph Neural Networks (GNNs) for large-scale graph inference by distilling graph knowledge into a more efficient Multi-Layer Perceptron (MLP), achieving improved performance and stability.

We study a challenging problem for Graph Neural Network inference on large-scale graph datasets: huge time and memory consumption. We try to overcome it by reducing reliance on graph structure. Although distilling graph knowledge into a student MLP is a promising idea, it faces two major problems: positional information loss and low generalization. To solve these problems, we propose a new three-stage multitask distillation framework. Specifically, we use Positional Encoding to capture positional information. We also introduce Neural Heat Kernels, which are responsible for graph data processing in the GNN, and use hidden-layer output matching to improve the student MLP's hidden layers. To the best of our knowledge, this is the first work to include hidden-layer distillation for a student MLP on graphs and to combine graph Positional Encoding with an MLP. We test its performance and robustness under several settings and conclude that our method performs well with good stability.
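To make the idea of a multitask distillation objective concrete, here is a minimal sketch for a single node, using only the Python standard library. It combines a supervised cross-entropy term, a soft-label term (KL divergence between teacher GNN and student MLP predictions), and a hidden-layer matching term (mean squared error). This is an illustration under stated assumptions, not the paper's implementation: the function names, the weights `alpha` and `beta`, the `temperature` parameter, and the choice of KL and MSE as the matching losses are all hypothetical, and the three-stage schedule, Positional Encoding, and Neural Heat Kernels from the abstract are not reproduced. The sketch also assumes the teacher and student hidden vectors have the same dimension.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Supervised loss of the student against the ground-truth class index."""
    probs = softmax(student_logits)
    return -math.log(probs[label])


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between softened class distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def hidden_mse(teacher_hidden, student_hidden):
    """Mean squared error between same-sized hidden representations."""
    squared = [(t - s) ** 2 for t, s in zip(teacher_hidden, student_hidden)]
    return sum(squared) / len(squared)


def multitask_distillation_loss(student_logits, teacher_logits,
                                student_hidden, teacher_hidden, label,
                                alpha=0.5, beta=0.1, temperature=1.0):
    """Weighted sum of the three terms; alpha and beta are hypothetical weights."""
    ce = cross_entropy(student_logits, label)
    kd = kl_divergence(teacher_logits, student_logits, temperature)
    hm = hidden_mse(teacher_hidden, student_hidden)
    return ce + alpha * kd + beta * hm


if __name__ == "__main__":
    # Toy values for one node with two classes and a 3-dimensional hidden layer.
    loss = multitask_distillation_loss(
        student_logits=[1.0, 0.2],
        teacher_logits=[2.0, -1.0],
        student_hidden=[0.1, 0.4, -0.3],
        teacher_hidden=[0.2, 0.5, -0.1],
        label=0,
    )
    print(f"combined loss: {loss:.4f}")
```

When the student reproduces the teacher's logits and hidden vector exactly, the two distillation terms vanish and only the supervised cross-entropy remains, which is a quick sanity check for any implementation of this kind of objective.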

