LG | Apr 22

Spectral Embeddings Leak Graph Topology: Theory, Benchmark, and Adaptive Reconstruction

Thinh Nguyen-Cong, Truong-Son Hy, Thang N. Dinh

arXiv:2604.21094 | 43.4 | h-index: 29 | Has Code

AI Analysis

For practitioners in federated graph learning and privacy-sensitive applications, this work provides a benchmark and a method for reconstructing graphs from noisy spectral fragments, though the approach is incremental.

The paper addresses graph learning when graph data are localized, fragmented, and noisy, introducing the LoGraB benchmark and the AFR reconstruction method. AFR achieves the best F1 on 7/9 datasets and retains 75% of its undefended F1 under differential privacy at ε=2.

Graph Neural Networks (GNNs) excel on relational data, but standard benchmarks unrealistically assume the graph is centrally available. In practice, settings such as Federated Graph Learning, distributed systems, and privacy-sensitive applications involve graph data that are localized, fragmented, noisy, and privacy-leaking. We present a unified framework for this setting. We introduce LoGraB (Local Graph Benchmark), which decomposes standard datasets into fragmented benchmarks using three strategies and four controls: neighborhood radius $d$, spectral quality $k$, noise level $σ$, and coverage ratio $p$. LoGraB supports graph reconstruction, localized node classification, and inter-fragment link prediction, with Island Cohesion. We propose AFR (Adaptive Fidelity-driven Reconstruction), a method for noisy spectral fragments. AFR scores patch quality via a fidelity measure combining a gap-to-truncation stability ratio and structural entropy, then assembles fragments using RANSAC-Procrustes alignment, adaptive stitching, and Bundle Adjustment. Rather than forcing a single global graph, AFR recovers large faithful islands. We prove heat-kernel edge recovery under a separation condition, Davis--Kahan perturbation stability, and bounded alignment error. We establish a Spectral Leakage Proposition: under a spectral-gap assumption, polynomial-time Bayesian recovery is feasible once enough eigenvectors are shared, complementing AFR's deterministic guarantees. Experiments on nine benchmarks show that LoGraB reveals model strengths and weaknesses under fragmentation, AFR achieves the best F1 on 7/9 datasets, and under per-embedding $(ε,δ)$-Gaussian differential privacy, AFR retains 75% of its undefended F1 at $ε=2$. Our anonymous code is available at https://anonymous.4open.science/r/JMLR_submission
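The alignment step mentioned in the abstract can be illustrated with a minimal sketch. Spectral embeddings of overlapping fragments agree only up to an orthogonal transform, because eigenvectors are defined up to sign and up to rotation within an eigenspace, so two patches must be aligned on their shared nodes before they can be stitched. The code below shows plain orthogonal Procrustes alignment only. It is not the paper's AFR implementation and omits the RANSAC outlier handling, adaptive stitching, and Bundle Adjustment. The function name `procrustes_align`, the patch sizes, and the noise level are assumptions chosen for illustration.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F.

    Rows of `source` and `target` are embeddings of the same shared nodes.
    """
    # Closed-form solution: SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)

# Hypothetical data: 6 shared nodes with 3-dimensional spectral embeddings.
patch_a = rng.normal(size=(6, 3))

# A random orthogonal matrix (via QR) stands in for the unknown
# sign/rotation ambiguity between the two fragments.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# The second patch sees the same nodes, rotated and slightly noisy.
patch_b = patch_a @ q + 0.01 * rng.normal(size=(6, 3))

rotation = procrustes_align(patch_a, patch_b)
residual = np.linalg.norm(patch_a @ rotation - patch_b)
print("alignment residual:", residual)
```

The residual after alignment is bounded by the norm of the injected noise, since the true rotation is itself a feasible solution. A RANSAC wrapper would repeat this fit on random subsets of shared nodes and keep the rotation with the most inliers.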
