SI AI CR LG

Feb 28, 2024

GraphPub: Generation of Differential Privacy Graph with High Availability

Wanghan Xu, Bin Shi, Ao Liu, Jiqiang Zhang, Bo Dong

arXiv:2403.00030v21.23 citations

h-index: 9

Originality: Incremental advance

AI Analysis

GraphPub addresses privacy concerns for data owners who publish sensitive graph data, such as social networks, by applying differential privacy without significantly degrading GNN performance. It is, however, an incremental improvement over existing methods.

The paper tackles the problem of applying differential privacy to graph data for GNN tasks, where perturbing the complex graph topology often reduces model accuracy. It proposes GraphPub, a framework that protects privacy by replacing some real edges with false ones while keeping data availability high. The authors report model accuracy close to that of the original graph under an extremely low privacy budget.

In recent years, with the rapid development of graph neural networks (GNNs), more and more graph datasets have been published for GNN tasks. However, when an upstream data owner publishes graph data, there are often many privacy concerns, because many real-world graphs contain sensitive information such as a person's friend list. Differential privacy (DP) is a common method to protect privacy, but due to the complex topological structure of graph data, applying DP to graphs often affects the message passing and aggregation of GNN models, leading to a decrease in model accuracy. In this paper, we propose a novel graph edge protection framework, graph publisher (GraphPub), which can protect graph topology while ensuring that the availability of the data is basically unchanged. Through reverse learning and the encoder-decoder mechanism, we search for some false edges that do not have a large negative impact on the aggregation of node features, and use them to replace some real edges. The modified graph, in which real and false edges are difficult to distinguish, is then published. Extensive experiments show that our framework achieves model accuracy close to that of the original graph with an extremely low privacy budget.
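To make the accuracy problem described in the abstract concrete, the sketch below shows a generic edge-level baseline, per-pair randomized response, which is a standard way to obtain edge differential privacy. It is not GraphPub's method: the abstract describes a learned search for false edges, whose details are not given here. The function name `randomized_response_edges` and its parameters are hypothetical and chosen only for illustration.

```python
import math
import random
from itertools import combinations


def randomized_response_edges(num_nodes, edges, epsilon, seed=None):
    """Perturb an undirected simple graph with per-pair randomized response.

    Illustrative baseline only (an assumption, not the GraphPub algorithm).
    Each node pair's edge/no-edge bit is kept with probability
    e^eps / (1 + e^eps) and flipped otherwise.
    """
    rng = random.Random(seed)
    # Same value as e^eps / (1 + e^eps), written to avoid overflow for large eps.
    keep_prob = 1.0 / (1.0 + math.exp(-epsilon))
    # Normalize edges to sorted tuples so (u, v) and (v, u) are the same pair.
    true_edges = {tuple(sorted(e)) for e in edges}

    published = set()
    for pair in combinations(range(num_nodes), 2):
        bit = pair in true_edges
        if rng.random() >= keep_prob:
            bit = not bit  # flip: a real edge is dropped or a false edge is added
        if bit:
            published.add(pair)
    return published


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (2, 3), (3, 4)]
    for eps in (0.1, 1.0, 5.0):
        noisy = randomized_response_edges(5, graph, eps, seed=0)
        print(eps, sorted(noisy))
```

With a small privacy budget the keep probability approaches 0.5, so in a sparse graph the many non-adjacent pairs that get flipped into false edges can outnumber the real edges that survive. This is the kind of topology damage that disturbs GNN message passing, and it is the motivation for choosing false edges selectively rather than uniformly at random.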
