cuSLINK: Single-linkage Agglomerative Clustering on the GPU
This work addresses scalability issues in clustering for data mining and machine learning applications, such as HDBSCAN, social networks, NLP, and computer vision, though it is incremental as it builds on the existing SLINK algorithm.
The paper tackles the computational bottleneck of single-linkage agglomerative clustering by proposing cuSLINK, a GPU-based algorithm that reduces space complexity to O(Nk) and enables previously intractable applications, achieving state-of-the-art performance.
In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only $O(Nk)$ space and uses a parameter $k$ to trade off space and time. We also propose a set of novel and reusable building blocks that compose cuSLINK. These building blocks include highly optimized computational patterns for $k$-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used our primitives to implement cuSLINK end-to-end on the GPU, further enabling a wide range of real-world data mining and machine learning applications that were once intractable. In addition to being a primary computational bottleneck in the popular HDBSCAN algorithm, the impact of our end-to-end cuSLINK algorithm spans a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision. Users can obtain cuSLINK at https://docs.rapids.ai/api/cuml/latest/api/#agglomerative-clustering