From random walks to distances on unweighted graphs
This addresses the need for robust metrics in network analysis, though it appears incremental as it builds on existing random walk and hitting time concepts.
The paper tackled the problem of defining similarity or dissimilarity between vertices in large unweighted directed graphs by introducing a metric based on Laplace transformed hitting time (LTHT), showing it is well-behaved and outperforms alternatives in tests on simulated and real-world data.
Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of the Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives.