SI IR

Jul 3, 2019

On the Privacy of dK-Random Graphs

Sameera Horawalavithana, Adriana Iamnitchi

arXiv:1907.01695v11.23 citations

Originality: Incremental advance

AI Analysis

The paper addresses the privacy risks of sharing graph datasets, a concern for both researchers and practitioners, but the advance is incremental because it builds on existing anonymization and attack methods.

This paper investigates why anonymized versions of real social network graphs remain vulnerable to de-anonymization attacks. It reports that attack success depends on three factors: the graph characteristics of the node subset from which the attacker starts, the dK-series used for anonymization, and structural properties of the original graph.

Real social network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. Previous research shows that many graph anonymization techniques fail against existing graph de-anonymization attacks. However, the specific reasons for the success of such de-anonymization attacks are yet to be understood. This paper systematically studies the structural properties of real graphs that make them more vulnerable to machine learning-based de-anonymization techniques. More precisely, we study the boundaries of anonymity based on the structural properties of real graph datasets, in terms of how their dK-based anonymized versions resist (or fail to resist) various types of attacks. Our experimental results lead to three contributions. First, we identify the strength of an attacker based on the graph characteristics of the subset of nodes from which it starts the de-anonymization attack. Second, we quantify the relative effectiveness of dK-series for graph anonymization. Third, we identify the properties of the original graph that make it more vulnerable to de-anonymization.
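To make the idea of a dK-based anonymized version concrete, the following is a minimal sketch of the simplest nontrivial case: a 1K-random graph, which keeps every node's degree but randomizes everything else through double-edge swaps. It is an illustration only, not the authors' pipeline. The function names `degree_sequence` and `one_k_rewire`, the swap budget, and the toy edge list are all assumptions introduced here; higher-order variants such as 2K, which also preserve the joint degree distribution, need additional constraints that this sketch omits.

```python
import random


def degree_sequence(edges):
    """Map each node to its degree in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def one_k_rewire(edges, n_swaps, seed=0):
    """Hypothetical helper: rewire a simple undirected graph while
    preserving every node's degree (the 1K property).

    Assumes `edges` has no self-loops and no duplicate edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list
    edge_set = {frozenset(e) for e in edge_list}
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # give up eventually on tiny or dense graphs
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible pairings at random
        # Proposed swap: (a, b), (c, d) -> (a, d), (c, b)
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop or leave the graph unchanged
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(frozenset((a, b)))
        edge_set.discard(frozenset((c, d)))
        edge_set.add(frozenset((a, d)))
        edge_set.add(frozenset((c, b)))
        edge_list[i] = (a, d)
        edge_list[j] = (c, b)
        done += 1
    return edge_list


# Toy graph (made up for illustration): a 6-cycle with one chord.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
anonymized = one_k_rewire(original, n_swaps=20, seed=1)
print(degree_sequence(original) == degree_sequence(anonymized))
```

Because each swap removes one edge at each of four distinct nodes and adds one back at each, the degree sequence is unchanged by construction. This is also why such a graph can still leak information: an attacker who knows node degrees loses nothing under 1K rewiring.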
