LG | AI | Oct 14, 2023

Mirage: Model-Agnostic Graph Distillation for Graph Classification

arXiv:2310.09486v4 | 18 citations | h-index: 23
Originality: Highly original
AI Analysis

Mirage addresses the need for efficient GNN training in low-resource environments, offering a more robust, unsupervised alternative to existing graph distillation methods.

The paper tackles the problem of scaling GNN training on large datasets by proposing Mirage, a model-agnostic graph distillation algorithm for graph classification. Mirage compresses computation trees instead of emulating gradient flows, which the authors report improves generalization accuracy, data compression, and distillation efficiency over state-of-the-art baselines.

GNNs, like other deep learning models, are data- and computation-hungry. There is a pressing need to scale GNN training on large datasets to enable their use in low-resource environments. Graph distillation is an effort in that direction, aiming to construct a smaller synthetic training set from the original training data without significantly compromising model performance. While initial efforts are promising, this work is motivated by two key observations: (1) Existing graph distillation algorithms themselves rely on training with the full dataset, which undermines the very premise of graph distillation. (2) The distillation process is specific to the target GNN architecture and hyper-parameters and is thus not robust to changes in the modeling pipeline. We circumvent these limitations by designing a distillation algorithm called Mirage for graph classification. Mirage is built on the insight that a message-passing GNN decomposes the input graph into a multiset of computation trees. Furthermore, the frequency distribution of computation trees is often skewed, enabling us to condense this data into a concise distilled summary. By compressing the computation data itself, as opposed to emulating gradient flows on the original training set (a prevalent approach to date), Mirage becomes an unsupervised and architecture-agnostic distillation algorithm. Extensive benchmarking on real-world datasets underscores Mirage's superiority, showcasing enhanced generalization accuracy, data compression, and distillation efficiency when compared to state-of-the-art baselines.
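The idea of decomposing graphs into a multiset of computation trees and exploiting their skewed frequencies can be illustrated with a small sketch. This is not the authors' implementation: the function names (`tree_signature`, `computation_tree_counts`, `top_k_coverage`), the adjacency-dict graph format, the string encoding of trees, and the simple top-k selection are all assumptions made for illustration. The selection step in Mirage may differ.

```python
from collections import Counter

def tree_signature(adj, labels, root, depth):
    """Canonical string for the depth-`depth` computation tree rooted at `root`.

    Hypothetical encoding: the node label followed by the sorted signatures
    of its neighbours' subtrees, so isomorphic trees map to the same string.
    """
    if depth == 0:
        return str(labels[root])
    children = sorted(
        tree_signature(adj, labels, nbr, depth - 1) for nbr in adj[root]
    )
    return f"{labels[root]}({','.join(children)})"

def computation_tree_counts(graphs, depth):
    """Count computation trees over a dataset of (adjacency, labels) pairs."""
    counts = Counter()
    for adj, labels in graphs:
        for node in adj:
            counts[tree_signature(adj, labels, node, depth)] += 1
    return counts

def top_k_coverage(counts, k):
    """Keep the k most frequent trees and report the fraction of all trees they cover."""
    total = sum(counts.values())
    kept = counts.most_common(k)
    return kept, sum(c for _, c in kept) / total

# Toy dataset (assumed for illustration): two small labelled path graphs.
graph_a = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "C", 2: "O"})
graph_b = ({0: [1], 1: [0]}, {0: "C", 1: "C"})

# Depth 1 corresponds to the trees a one-layer message-passing GNN would see.
counts = computation_tree_counts([graph_a, graph_b], depth=1)
kept, coverage = top_k_coverage(counts, k=1)
print(counts)
print(kept, coverage)
```

In this toy dataset the tree `C(C)` accounts for 3 of the 5 computation trees, so keeping that single tree already covers 60% of the multiset. Because the counting step never looks at class labels or at a particular GNN's gradients, the sketch also shows why a summary of this kind can be built without supervision and without committing to an architecture.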

Code Implementations: 1 repo