Improved large-scale graph learning through ridge spectral sparsification
This addresses the challenge of fast and approximate learning in distributed graph systems, which is incremental as it builds on spectral sparsification methods.
The paper tackles the problem of learning over graph Laplacians in a distributed streaming setting, where edges are observed in real time, by introducing GSQUEAK, an algorithm that efficiently sparsifies the Laplacian with strong spectral approximation guarantees while processing edges in a single pass.
Graph-based techniques and spectral graph theory have enriched the field of machine learning with a variety of critical advances. A central object in the analysis is the graph Laplacian L, which encodes the structure of the graph. We consider the problem of learning over this Laplacian in a distributed streaming setting, where new edges of the graph are observed in real time by a network of workers. In this setting, it is hard to learn quickly or approximately while keeping a distributed representation of L. To address this challenge, we present a novel algorithm, GSQUEAK, which efficiently sparsifies the Laplacian by maintaining a small subset of effective resistances. We show that our algorithm produces sparsifiers with strong spectral approximation guarantees, all while processing edges in a single pass and in a distributed fashion.