LG · DC · NA · Dec 8, 2022

A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

arXiv:2212.04443v27 citations · h-index: 30
AI Analysis

This work addresses the computational bottleneck in spectral clustering for big data applications, offering a parallel solution with incremental improvements over existing eigensolvers.

The paper addresses large-scale leading eigenvalue problems in spectral clustering with a distributed Block Chebyshev-Davidson algorithm. Analytic spectrum estimation makes the eigensolver efficient, and the parallel version scales with a speedup of approximately the square root of the number of processes.

We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on prior knowledge of the eigenvalue spectrum, which can be expensive to estimate. This issue is mitigated by the analytic spectrum estimates available for the Laplacian and normalized Laplacian matrices in spectral clustering, which makes the proposed algorithm very efficient for this application. Second, to make the proposed algorithm capable of analyzing big data, a distributed and parallel version has been developed with attractive scalability. The speedup from parallel computing is approximately $\sqrt{p}$, where $p$ denotes the number of processes. Numerical results will be provided to demonstrate its efficiency in spectral clustering and its scalability advantage over existing eigensolvers used for spectral clustering in parallel computing environments.
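To illustrate why an analytic spectrum bound helps, here is a minimal single-process sketch in Python. It is not the paper's Block Chebyshev-Davidson algorithm and it is not distributed. It is a plain Chebyshev-filtered subspace iteration, and it relies only on the standard fact that the eigenvalues of a normalized graph Laplacian lie in $[0, 2]$, so the upper end of the damped interval needs no numerical estimation. The function names, the `cutoff` parameter, the filter degree, and the two-clique test graph are all assumptions made for this example.

```python
import numpy as np

def two_clique_graph(m):
    """Adjacency matrix of two m-node cliques joined by a single edge (toy data)."""
    A = np.zeros((2 * m, 2 * m))
    A[:m, :m] = 1.0
    A[m:, m:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[m - 1, m] = A[m, m - 1] = 1.0  # the bridge between the cliques
    return A

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2}; assumes A is symmetric with no isolated nodes."""
    inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(A.shape[0]) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

def chebyshev_filter(L, X, degree, a, b):
    """Apply T_degree((L - c I) / e) to X, where [a, b] is the interval to damp.

    Eigencomponents with eigenvalues inside [a, b] stay bounded by 1 in
    magnitude, while those below a are amplified rapidly with the degree.
    """
    e = (b - a) / 2.0  # half-width of the damped interval
    c = (b + a) / 2.0  # centre of the damped interval
    Y = (L @ X - c * X) / e
    for _ in range(2, degree + 1):
        # Three-term Chebyshev recurrence: T_{j+1} = 2 t T_j - T_{j-1}
        X, Y = Y, 2.0 * (L @ Y - c * Y) / e - X
    return Y

def filtered_subspace_iteration(L, k, cutoff, degree=8, iters=30, seed=0):
    """Approximate the k smallest eigenpairs of a normalized Laplacian L.

    The upper bound 2.0 is the analytic bound on the spectrum; only the
    lower edge `cutoff` of the unwanted interval is a user choice here.
    """
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((L.shape[0], k)))
    for _ in range(iters):
        X = chebyshev_filter(L, X, degree, cutoff, 2.0)
        X, _ = np.linalg.qr(X)            # re-orthonormalize the block
        w, V = np.linalg.eigh(X.T @ L @ X)  # Rayleigh-Ritz projection
        X = X @ V
    return w, X

A = two_clique_graph(10)
L = normalized_laplacian(A)
w, X = filtered_subspace_iteration(L, k=2, cutoff=0.5)
labels = X[:, 1] > 0  # sign of the second eigenvector separates the cliques
```

In this toy graph the two smallest eigenvalues sit well below the chosen `cutoff`, while the rest of the spectrum lies inside the damped interval, so the filter isolates the clustering subspace in a few iterations. For real data the cutoff would have to be chosen or adapted, which this sketch does not attempt.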

Code Implementations: 1 repo