GLASU: A Communication-Efficient Algorithm for Federated Learning with Vertically Distributed Graph Data
This work addresses an emerging scenario, vertical federated learning (VFL) on graph data, and offers a practical solution to its communication bottleneck. The contribution is incremental in that it builds on existing VFL and GNN methods.
The paper tackles the high communication overhead of vertical federated learning on graph-structured data. It proposes GLASU, a communication-efficient algorithm that splits a GNN model across the clients and a server and uses lazy aggregation and stale updates to reduce communication, while matching the performance of centralized training on real-world datasets.
Vertical federated learning (VFL) is a distributed learning paradigm in which computing clients collectively train a model, each holding only part of the features of the same set of samples. Current research on VFL focuses on the case where samples are independent and rarely addresses an emerging scenario in which samples are interrelated through a graph. For graph-structured data, graph neural networks (GNNs) are competitive machine learning models, but a naive implementation in the VFL setting incurs significant communication overhead. Moreover, analyzing the training is challenging because the stochastic gradients are biased. In this paper, we propose a model splitting method that splits a backbone GNN across the clients and the server, together with a communication-efficient algorithm, GLASU, to train such a model. GLASU adopts lazy aggregation and stale updates to skip aggregation when evaluating the model and to skip feature exchanges during training, greatly reducing communication. We offer a theoretical analysis and conduct extensive numerical experiments on real-world datasets, showing that the proposed algorithm effectively trains a GNN model whose performance matches that of the backbone GNN trained in a centralized manner.
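To make the communication saving concrete, the sketch below counts client-server exchanges under a toy cost model. It is an illustration only, not the paper's algorithm or its cost analysis. The function `exchange_counts` and its parameters `aggregate_every` and `local_steps` are hypothetical names introduced here. The model assumes that lazy aggregation aggregates at only every `aggregate_every`-th GNN layer, and that stale updates let one exchange round serve `local_steps` consecutive training iterations.

```python
from math import ceil


def exchange_counts(num_layers, num_iterations, aggregate_every=1, local_steps=1):
    """Count client-server exchanges under a toy cost model (an assumption).

    aggregate_every: hypothetical knob for lazy aggregation; only every
        `aggregate_every`-th layer triggers an aggregation.
    local_steps: hypothetical knob for stale updates; one exchange round is
        reused for `local_steps` consecutive training iterations.
    """
    if min(num_layers, num_iterations, aggregate_every, local_steps) < 1:
        raise ValueError("all arguments must be positive integers")
    # Lazy aggregation: fewer layers involve the server.
    aggregating_layers = ceil(num_layers / aggregate_every)
    # Stale updates: fewer iterations trigger a fresh exchange.
    rounds = ceil(num_iterations / local_steps)
    return aggregating_layers * rounds


# Naive scheme: aggregate at every layer in every iteration.
naive = exchange_counts(num_layers=4, num_iterations=1000)
# Aggregate at every second layer and reuse each exchange for 4 iterations.
reduced = exchange_counts(num_layers=4, num_iterations=1000,
                          aggregate_every=2, local_steps=4)
print(naive, reduced)  # 4000 500
```

In this model the two mechanisms multiply: halving the aggregating layers and reusing each exchange for four iterations cuts the count by a factor of eight. The model ignores message sizes and any accuracy cost of using stale information, which the paper addresses through its analysis and experiments.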