Parametrized Power-Iteration Clustering for Directed Graphs
This addresses the problem of clustering directed graphs for researchers and practitioners, offering an incremental improvement over existing methods.
The paper tackles vertex-level clustering for directed graphs by proposing Parametrized Power-Iteration Clustering (ParPIC), a method that achieves competitive clustering accuracy with improved scalability compared to spectral and teleportation-based approaches.
Vertex-level clustering for directed graphs (digraphs) remains challenging as edge directionality breaks the key assumptions underlying popular spectral methods, which also incur the overhead of eigen-decomposition. This paper proposes Parametrized Power-Iteration Clustering (ParPIC), a random-walk-based clustering method for weakly connected digraphs. This builds over the Power-Iteration Clustering paradigm, which uses the rows of the iterated diffusion operator as a data embedding. ParPIC has three important features: the use of parametrized reversible random walk operators, the automatic tuning of the diffusion time, and the efficient truncation of the final embedding, which produces low-dimensional data representations and reduces complexity. Empirical results on synthetic and real-world graphs demonstrate that ParPIC achieves competitive clustering accuracy with improved scalability relative to spectral and teleportation-based methods.