DS LGMar 21, 2025

Fast online node labeling with graph subsampling

Yushen Huang, Ertai Luo, Reza Babenezhad, Yifan Sun

arXiv:2503.16755v21.2h-index: 3

Originality Incremental advance

AI Analysis

This work addresses scalability issues for large-scale graph applications, but it is incremental as it builds on existing APPR methods with a subsampling twist.

The paper tackles the problem of computational inefficiency in node labeling for large graphs by proposing an online subsampled APPR method that randomly drops messages, achieving approximation bounds with O(1/ε²) edges and O(nε) overhead.

Large data applications rely on storing data in massive, sparse graphs with millions to trillions of nodes. Graph-based methods, such as node prediction, aim for computational efficiency regardless of graph size. Techniques like localized approximate personalized page rank (APPR) solve sparse linear systems with complexity independent of graph size, but is in terms of the maximum node degree, which can be much larger in practice than the average node degree for real-world large graphs. In this paper, we consider an \emph{online subsampled APPR method}, where messages are intentionally dropped at random. We use tools from graph sparsifiers and matrix linear algebra to give approximation bounds on the graph's spectral properties ($O(1/ε^2)$ edges), and node classification performance (added $O(nε)$ overhead).

View on arXiv PDF

Similar