CR LG Feb 27, 2025

ADAGE: Active Defenses Against GNN Extraction

Jing Xu, Franziska Boenisch, Adam Dziedzic

arXiv:2503.00065v36.41 citations h-index: 12

Originality: Highly original

AI Analysis

This work addresses the security problem of protecting valuable GNN models from theft, which is crucial for applications such as drug discovery and recommendation systems. It proposes a novel defense rather than an incremental improvement.

The paper tackles the problem of model stealing attacks on Graph Neural Networks (GNNs) by proposing ADAGE, an active defense that prevents extraction across all common attack setups, rendering stealing impossible while preserving predictive performance.

Graph Neural Networks (GNNs) achieve high performance in various real-world applications, such as drug discovery, traffic state prediction, and recommendation systems. Because building powerful GNNs requires a large amount of training data, powerful computing resources, and human expertise, the models become lucrative targets for model stealing attacks. Prior work has revealed that the threat vector of stealing attacks against GNNs is large and diverse, as an attacker can leverage various heterogeneous signals, ranging from node labels to high-dimensional node embeddings, to create a local copy of the target GNN at a fraction of the original training costs. This diversity in the threat vector makes the design of effective and general defenses challenging, and existing defenses usually focus on one particular stealing setup. Additionally, they only provide means to identify stolen model copies rather than preventing the attack. To close this gap, we propose the first general Active Defense Against GNN Extraction (ADAGE). ADAGE builds on the observation that stealing a model's full functionality requires highly diverse queries to leak its behavior across the input space. Our defense monitors this query diversity and progressively perturbs outputs as the accumulated leakage grows. In contrast to prior work, ADAGE can prevent stealing across all common attack setups. Our extensive experimental evaluation using six benchmark datasets, four GNN models, and three types of adaptive attackers shows that ADAGE penalizes attackers to the degree of rendering stealing impossible, while preserving predictive performance on downstream tasks. ADAGE thereby contributes towards securely sharing valuable GNNs in the future.
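
The abstract describes the defense only at a high level: monitor the diversity of a user's queries and perturb outputs progressively as the accumulated leakage grows. It does not say how diversity is measured or how outputs are perturbed. The following is a minimal sketch of that general idea under explicit assumptions, not the authors' implementation. It assumes the input space is pre-partitioned into a fixed number of regions, that leakage is the fraction of regions a user has queried, and that perturbation means blending the returned class probabilities toward a uniform distribution. The names `DiversityMonitor`, `tolerance`, `perturb`, and `answer` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DiversityMonitor:
    """Hypothetical per-user monitor (an illustration, not ADAGE itself)."""
    num_regions: int            # assumed: input space split into this many regions
    tolerance: float = 0.25     # assumed: coverage allowed before any perturbation
    seen: set = field(default_factory=set)

    def record(self, region_id: int) -> float:
        """Register the region a query falls into; return covered fraction."""
        self.seen.add(region_id)
        return len(self.seen) / self.num_regions

    def perturbation_strength(self) -> float:
        """Map coverage to a strength in [0, 1]; zero up to the tolerance."""
        coverage = len(self.seen) / self.num_regions
        if coverage <= self.tolerance:
            return 0.0
        return (coverage - self.tolerance) / (1.0 - self.tolerance)


def perturb(probs, strength):
    """Blend a probability vector toward the uniform distribution."""
    uniform = 1.0 / len(probs)
    return [(1.0 - strength) * p + strength * uniform for p in probs]


def answer(monitor, region_id, probs):
    """Return the (possibly perturbed) model output for one query."""
    monitor.record(region_id)
    return perturb(probs, monitor.perturbation_strength())
```

In this sketch, a user who keeps querying the same small part of the input space receives unmodified outputs, while a user whose queries sweep across all regions receives increasingly uninformative ones. This mirrors the abstract's claim that predictive performance on downstream tasks is preserved while extraction is penalized.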
