LG AI CR DC | May 7, 2024

Federated Graph Condensation with Information Bottleneck Principles

Bo Yan, Sihao He, Cheng Yang, Shang Liu, Yang Cao, Chuan Shi

arXiv:2405.03911v4 | 13 citations | h-index: 1 | AAAI

Originality: Incremental advance

AI Analysis

This work addresses privacy and data distribution challenges in graph learning for decentralized applications, though its contribution is incremental: it adapts existing techniques to a federated setting.

The paper tackles the problem of graph condensation in decentralized settings by proposing federated graph condensation (FGC) to synthesize small-scale graphs while preserving privacy, achieving comparable or superior performance to centralized methods and protecting against membership inference attacks.

Graph condensation (GC), which reduces the size of a large-scale graph by synthesizing a small-scale condensed graph as its substitute, has benefited various graph learning tasks. However, existing GC methods rely on centralized data storage, which is infeasible for real-world decentralized data distribution, and overlook data holders' privacy-preserving requirements. To bridge this gap, we propose and study the novel problem of federated graph condensation (FGC) for graph neural networks (GNNs). Specifically, we first propose a general framework for FGC, where we decouple the typical gradient matching process for GC into client-side gradient calculation and server-side gradient matching, integrating knowledge from multiple clients' subgraphs into one smaller condensed graph. Nevertheless, our empirical studies show that under the federated setting, the condensed graph will consistently leak data membership privacy, i.e., the condensed graph during federated training can be utilized to steal training data under the membership inference attack (MIA). To tackle this issue, we innovatively incorporate information bottleneck principles into FGC, which only needs to extract partial node features in one local pre-training step and utilize the features during federated training. Theoretical and experimental analyses demonstrate that our framework consistently protects membership privacy during training. Meanwhile, it can achieve comparable or even superior performance relative to existing centralized GC and federated graph learning (FGL) methods.
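The split between client-side gradient calculation and server-side gradient matching described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a linear model with a squared-error loss in place of a GNN, ignores graph structure (node features are treated as already propagated), and leaves out the information bottleneck step entirely. All function names (`client_gradient`, `aggregate`, `matching_loss`, `matching_grad`, `condense`) and hyperparameters are hypothetical.

```python
import numpy as np


def client_gradient(X, Y, W):
    """Client side: gradient of the local loss 1/(2n) * ||XW - Y||^2 w.r.t. W."""
    return X.T @ (X @ W - Y) / X.shape[0]


def aggregate(grads, sizes):
    """Server side: average the client gradients, weighted by local node count."""
    weights = np.asarray(sizes, dtype=float) / sum(sizes)
    return sum(w * g for w, g in zip(weights, grads))


def matching_loss(X_syn, Y_syn, W, G_target):
    """Half squared Frobenius distance between synthetic and aggregated gradients."""
    return 0.5 * np.sum((client_gradient(X_syn, Y_syn, W) - G_target) ** 2)


def matching_grad(X_syn, Y_syn, W, G_target):
    """Analytic gradient of matching_loss w.r.t. the synthetic features X_syn."""
    m = X_syn.shape[0]
    R = X_syn @ W - Y_syn                              # residuals on synthetic data
    E = client_gradient(X_syn, Y_syn, W) - G_target    # gradient mismatch
    return (R @ E.T + X_syn @ E @ W.T) / m


def condense(clients, m, rounds=200, lr=0.05, seed=0):
    """Learn m synthetic nodes from a list of (X, Y) client datasets.

    Only gradients leave the clients; raw features and labels stay local.
    """
    rng = np.random.default_rng(seed)
    d = clients[0][0].shape[1]
    c = clients[0][1].shape[1]
    X_syn = rng.normal(size=(m, d))
    Y_syn = np.eye(c)[np.arange(m) % c]        # fixed, balanced one-hot labels
    for _ in range(rounds):
        W = 0.1 * rng.normal(size=(d, c))      # fresh model sampled each round (assumption)
        grads = [client_gradient(X, Y, W) for X, Y in clients]
        G = aggregate(grads, [X.shape[0] for X, _ in clients])
        X_syn -= lr * matching_grad(X_syn, Y_syn, W, G)
    return X_syn, Y_syn
```

For this particular loss, the size-weighted average of client gradients equals the gradient computed on the pooled data, so the server matches against the same target a centralized method would see while never receiving the raw subgraphs. A real GNN-based variant would need automatic differentiation rather than the closed-form `matching_grad` used here.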
