KCoreMotif: An Efficient Graph Clustering Algorithm for Large Networks by Exploiting k-core Decomposition and Motifs
This work addresses the problem of scaling graph clustering for large networks, which is incremental as it builds on existing spectral and motif-based methods.
The paper tackles the computational inefficiency of spectral clustering on large networks by proposing KCoreMotif, an algorithm that uses k-core decomposition and motifs to cluster large graphs efficiently. Results show it is accurate and efficient on 18 real-world datasets compared to baseline methods.
Clustering analysis has been widely used in trust evaluation on various complex networks such as wireless sensors networks and online social networks. Spectral clustering is one of the most commonly used algorithms for graph-structured data (networks). However, the conventional spectral clustering is inherently difficult to work with large-scale networks due to the fact that it needs computationally expensive matrix manipulations. To deal with large networks, in this paper, we propose an efficient graph clustering algorithm, KCoreMotif, specifically for large networks by exploiting k-core decomposition and motifs. The essential idea behind the proposed clustering algorithm is to perform the efficient motif-based spectral clustering algorithm on k-core subgraphs, rather than on the entire graph. More specifically, (1) we first conduct the k-core decomposition of the large input network; (2) we then perform the motif-based spectral clustering for the top k-core subgraphs; (3) we group the remaining vertices in the rest (k-1)-core subgraphs into previously found clusters; and finally obtain the desired clusters of the large input network. To evaluate the performance of the proposed graph clustering algorithm KCoreMotif, we use both the conventional and the motif-based spectral clustering algorithms as the baselines and compare our algorithm with them for 18 groups of real-world datasets. Comparative results demonstrate that the proposed graph clustering algorithm is accurate yet efficient for large networks, which also means that it can be further used to evaluate the intra-cluster and inter-cluster trusts on large networks.