Hermitian matrices for clustering directed graphs: insights and applications
This work addresses the challenge of clustering directed graphs, with applications such as migration analysis. It offers a novel method that captures directional structure, though it is an incremental improvement on existing spectral techniques.
The authors tackle spectral clustering for directed graphs, where existing methods often fail to capture the information carried by edge direction. They propose a complex-valued matrix representation and demonstrate that it clusters U.S. migration data by socio-economic profile rather than only by geographic proximity.
Graph clustering is a basic technique in machine learning and has widespread applications in different domains. While spectral techniques have been successfully applied to clustering undirected graphs, the performance of spectral clustering algorithms for directed graphs (digraphs) is not in general satisfactory: these algorithms usually require symmetrising the matrix representing a digraph, and typical objective functions for undirected graph clustering do not capture cluster structures in which the information given by the direction of the edges is crucial. To overcome these downsides, we propose a spectral clustering algorithm based on a complex-valued matrix representation of digraphs. We analyse its theoretical performance on a Stochastic Block Model for digraphs in which the cluster structure is given not only by variations in edge densities, but also by the direction of the edges. The significance of our work is highlighted on a data set pertaining to internal migration in the United States: while previous spectral clustering algorithms for digraphs can only reveal that people are more likely to move between counties that are geographically close, our approach is able to cluster together counties with a similar socio-economic profile even when they are geographically distant, and it illustrates how people tend to move from rural to more urbanised areas.
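To make the idea of a complex-valued representation concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the matrix, so the sketch assumes one common Hermitian choice, H = i(A − Aᵀ), where A is the 0/1 adjacency matrix of the digraph. The function name `hermitian_embedding` and the parameter `n_vectors` are hypothetical, introduced only for illustration. The embedding is built from the projection onto the eigenvectors with the largest eigenvalues in absolute value, which makes it independent of the arbitrary phase of each eigenvector.

```python
import numpy as np


def hermitian_embedding(adjacency, n_vectors=1):
    """Embed the vertices of a digraph using an assumed Hermitian matrix.

    adjacency[u, v] = 1 if there is an edge u -> v, else 0.
    Returns one real-valued row per vertex.
    """
    A = np.asarray(adjacency, dtype=float)

    # Assumed representation (not taken from the abstract):
    # H[u, v] = i for u -> v, -i for v -> u, 0 for no edge or edges both ways.
    H = 1j * (A - A.T)

    # H is Hermitian, so eigh applies and the eigenvalues are real.
    eigenvalues, eigenvectors = np.linalg.eigh(H)

    # Keep the eigenvectors whose eigenvalues are largest in absolute value.
    top = np.argsort(-np.abs(eigenvalues))[:n_vectors]
    G = eigenvectors[:, top]

    # Rows of the projection G G* do not depend on eigenvector phases.
    P = G @ G.conj().T

    # Split into real and imaginary parts so standard k-means can be used.
    return np.hstack([P.real, P.imag])


# Toy digraph: every edge points from vertices {0, 1, 2} to vertices {3, 4, 5}.
# The clusters differ only in edge direction, not in edge density.
A = np.zeros((6, 6))
A[:3, 3:] = 1
embedding = hermitian_embedding(A, n_vectors=1)
```

On this toy graph, vertices on the same side of the flow receive identical embedding rows, while vertices on opposite sides do not. In a full pipeline the rows would be passed to a clustering routine such as k-means; that step is omitted here to keep the sketch dependent on NumPy alone.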