GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

Kaixiang Zhao, Bolin Shen, Yuyang Dai, Shayok Chakraborty, Yushun Dong

arXiv:2605.1282772.9Has Code

AI Analysis

Provides a standardized evaluation framework for the understudied problem of GNN model stealing, revealing that current defenses are largely ineffective.

GraphIP-Bench is a unified benchmark for evaluating GNN model extraction attacks and defenses, finding that stealing is easy at medium query budgets and most defenses fail to prevent it, with heterophilic graphs being harder to steal.

Graph neural networks (GNNs) deployed as cloud services can be \emph{stolen} through \emph{model-extraction attacks}, which train a surrogate from query responses to reproduce the target's behaviour, and a growing line of ownership defenses tries to prevent or trace such theft. The title of this paper asks two questions: \emph{how hard is it to steal a GNN?}, and \emph{can we stop it?} Prior work cannot answer either, because experiments use inconsistent datasets, threat models, and metrics. We introduce \emph{GraphIP-Bench}, a unified benchmark which evaluates both sides under a single black-box protocol. It integrates twelve extraction attacks, twelve defenses spanning watermarking, output-perturbation, and query-pattern-detection families, ten public graphs covering homophilic, heterophilic, and large-scale regimes, three GNN backbones, and three graph-learning tasks, and it reports fidelity, task utility, ownership verification, and computational cost on shared splits, queries, and budgets. We further add a joint attack-and-defense track which runs every attack on every defended target and measures watermark verification on the resulting surrogate, which exposes the protection that a defense retains after extraction. The empirical picture is short: stealing a GNN is easy at medium query budgets and most defenses do not change this; several watermarks verify reliably on the protected model but lose most of their verification signal on the extracted surrogate, which exposes a gap that single-model evaluations miss; and heterophilic graphs are systematically harder to steal, while a cross-architecture mismatch between target and surrogate reduces but does not prevent extraction. Code: \href{https://github.com/LabRAI/GraphIP-Bench}{LabRAI/GraphIP-Bench}.

View on arXiv PDF Code

Similar