Efficient Algorithms for Computing Random Walk Centrality
This work addresses a computational bottleneck for researchers and practitioners in graph mining who need to analyze node importance in large-scale networks, representing an incremental improvement over existing methods.
The paper tackled the problem of efficiently computing random walk centrality for large networks, which was previously impractical due to computational demands, and resulted in two scalable algorithms that operate in near-linear time with strong approximation guarantees, as validated on networks with over 10 million nodes.
Random walk centrality is a fundamental metric in graph mining for quantifying node importance and influence, defined as the weighted average of hitting times to a node from all other nodes. Despite its ability to capture rich graph structural information and its wide range of applications, computing this measure for large networks remains impractical due to the computational demands of existing methods. In this paper, we present a novel formulation of random walk centrality, underpinning two scalable algorithms: one leveraging approximate Cholesky factorization and sparse inverse estimation, while the other sampling rooted spanning trees. Both algorithms operate in near-linear time and provide strong approximation guarantees. Extensive experiments on large real-world networks, including one with over 10 million nodes, demonstrate the efficiency and approximation quality of the proposed algorithms.