Efficient Learning on Large Graphs using a Densifying Regularity Lemma
This work addresses efficiency problems faced by researchers and practitioners working with large-scale graph data, offering a new theoretical framework with practical applications.
The paper tackles the computational and memory challenges of learning on large graphs by introducing the Intersecting Block Graph (IBG), a low-rank factorization that approximates any graph, sparse or dense, by a dense representation whose rank depends only on the chosen accuracy, not on the sparsity of the graph. It demonstrates competitive performance on node classification, spatio-temporal graph analysis, and knowledge graph completion, with memory and computational complexity linear in the number of nodes rather than the number of edges.
Learning on large graphs presents significant challenges: traditional Message Passing Neural Networks suffer from computational and memory costs that scale linearly with the number of edges. We introduce the Intersecting Block Graph (IBG), a low-rank factorization of large directed graphs based on combinations of intersecting bipartite components, each consisting of a pair of communities, one for source nodes and one for target nodes. By giving less weight to non-edges, we show how to efficiently approximate any graph, sparse or dense, by a dense IBG. Specifically, we prove a constructive version of the weak regularity lemma, showing that for any chosen accuracy, every graph, regardless of its size or sparsity, can be approximated by a dense IBG whose rank depends only on the accuracy. This dependence of the rank solely on the accuracy, and not on the sparsity level, is in contrast to previous forms of the weak regularity lemma. We present a graph neural network architecture that operates on the IBG representation of the graph and demonstrates competitive performance on node classification, spatio-temporal graph analysis, and knowledge graph completion, while having memory and computational complexity linear in the number of nodes rather than edges.
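
The abstract does not give the IBG's algebraic form, so the following is only a minimal sketch under stated assumptions. It assumes an IBG with K components can be written as U diag(r) V^T, where column k of U marks the source community of component k, column k of V marks its target community, and r_k is the component's weight. Under that assumption, multiplying the IBG by a node-feature matrix costs O(N K D) and never forms an N x N matrix, which is one way to read the "linear in the number of nodes" claim. The function names, the parameter `non_edge_weight`, and the particular weighted squared error are hypothetical stand-ins for "giving less weight to non-edges", not the paper's definitions.

```python
import numpy as np


def ibg_dense(U, V, r):
    """Materialize the assumed IBG matrix U diag(r) V^T (for small checks only)."""
    # U * r scales column k of U by r[k], so (U * r) @ V.T equals U diag(r) V^T.
    return (U * r) @ V.T


def ibg_propagate(U, V, r, X):
    """Compute (U diag(r) V^T) @ X without building the N x N matrix.

    Cost is O(N * K * D): first pool features over target communities,
    then scale each component, then broadcast back to source communities.
    """
    pooled = V.T @ X              # shape (K, D)
    scaled = r[:, None] * pooled  # shape (K, D)
    return U @ scaled             # shape (N, D)


def weighted_sq_error(A, C, non_edge_weight):
    """Squared error between adjacency A and approximation C.

    Entries where A has an edge get weight 1; non-edges get the
    (hypothetical) weight non_edge_weight, typically below 1.
    """
    W = np.where(A > 0, 1.0, non_edge_weight)
    return float(np.sum(W * (A - C) ** 2))


# Small illustrative instance with made-up sizes and random communities.
rng = np.random.default_rng(0)
N, K, D = 6, 2, 3
U = (rng.random((N, K)) < 0.5).astype(float)  # source-community indicators
V = (rng.random((N, K)) < 0.5).astype(float)  # target-community indicators
r = rng.random(K)                             # component weights
X = rng.standard_normal((N, D))               # node features

fast = ibg_propagate(U, V, r, X)
slow = ibg_dense(U, V, r) @ X
print(np.allclose(fast, slow))
```

The factorized product is the point of the sketch: a network layer that only ever touches U, V, r, and X has cost governed by N and K, independent of how many edges the original graph has.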