DB, IR
Feb 8, 2020

Index-based Solutions for Efficient Density Peak Clustering

arXiv:2002.03182v2
2 citations
AI Analysis

This work addresses efficiency and parameter sensitivity issues in DPC for clustering applications, but it is incremental as it builds on existing index structures.

The paper tackles two weaknesses of Density Peak Clustering (DPC): its sensitivity to the parameter $d_c$ and its inefficiency. It proposes index-based solutions, namely list and tree indices with efficient query algorithms, and evaluates them on six synthetic and real datasets to guide index selection.

Density Peak Clustering (DPC), a popular density-based clustering approach, has received considerable attention from the research community, primarily due to its simplicity and the small number of parameters it requires. However, the clusters obtained using DPC are sensitive to the parameter $d_c$, whose appropriate value depends on the data distribution and on the requirements of different users. Moreover, the original DPC algorithm must visit a large number of objects, which makes it slow. To this end, this paper investigates index-based solutions for DPC. Specifically, we propose two list-based index methods: (i) a simple List Index, and (ii) an advanced Cumulative Histogram Index. We propose efficient query algorithms for these indices that avoid a large share of irrelevant comparisons at the cost of additional space. For memory-constrained systems, we further introduce an approximate variant of these indices that substantially reduces the space cost, provided that slight inaccuracies are admissible. Furthermore, because existing tree-based index structures have considerably lower memory requirements, we also present effective pruning techniques and efficient query algorithms to support DPC using the popular Quadtree Index and R-tree Index. Finally, we evaluate all of the above indices in practice and present the findings of an extensive set of experiments on six synthetic and real datasets. The experimental insights can help guide the selection of a suitable index.
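To make the two DPC quantities and the list-based idea concrete, here is a minimal sketch. It is not the paper's implementation, since the abstract does not spell out the index layout. It assumes Euclidean distance, the cut-off form of density (the number of neighbours strictly closer than $d_c$), and a per-point neighbour list sorted by distance. The function names `build_list_index`, `local_density`, and `dependent_distance` are hypothetical.

```python
import bisect
import math


def build_list_index(points):
    """Assumed layout: for each point, (distance, neighbour_id) pairs sorted by distance."""
    index = []
    for i, p in enumerate(points):
        row = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        index.append(row)
    return index


def local_density(index, dc):
    """rho_i = number of neighbours strictly closer than dc, found by binary search."""
    # (dc, -1) sorts before every real entry at distance exactly dc,
    # so bisect_left counts only entries with distance < dc.
    return [bisect.bisect_left(row, (dc, -1)) for row in index]


def dependent_distance(index, rho):
    """delta_i = distance to the nearest neighbour with strictly higher density."""
    delta = []
    for i, row in enumerate(index):
        # The list is sorted by distance, so the first denser entry is the nearest one.
        d = next((dist for dist, j in row if rho[j] > rho[i]), None)
        if d is None:
            # No denser point exists (a global density peak, or a tie for the peak):
            # fall back to the distance of the farthest neighbour.
            d = row[-1][0] if row else 0.0
        delta.append(d)
    return delta


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    idx = build_list_index(pts)      # built once, O(n^2) space
    for dc in (1.2, 1.5):            # re-query for different dc without rebuilding
        rho = local_density(idx, dc)
        print(dc, rho, dependent_distance(idx, rho))
```

The point of the sketch is the trade-off the abstract describes: the index costs quadratic space, but once it is built, trying a different $d_c$ only needs one binary search per point instead of a fresh pass over all pairwise distances.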
