LG · Oct 11, 2021

Density-Based Clustering with Kernel Diffusion

arXiv:2110.05096v3
Originality: Incremental advance
AI Analysis

This paper addresses density estimation for clustering in complex datasets, particularly face images. It appears incremental because it builds on existing density-based clustering frameworks.

The paper targets the density functions used in clustering algorithms such as DBSCAN and DPC, which often fail to capture local features in complex datasets. It proposes a kernel diffusion density function that adapts to varying local characteristics and smoothness. The reported results show significant improvements over classic density-based methods and outperform state-of-the-art face clustering methods by a large margin on benchmark and large-scale face image datasets.

Finding a suitable density function is essential for density-based clustering algorithms such as DBSCAN and DPC. A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in these algorithms. Such a density struggles to capture local features in complex datasets. To tackle this issue, we propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness. Furthermore, we develop a surrogate that can be efficiently computed in linear time and space and prove that it is asymptotically equivalent to the kernel diffusion density function. Extensive empirical experiments on benchmark and large-scale face image datasets show that the proposed approach not only achieves a significant improvement over classic density-based clustering algorithms but also outperforms the state-of-the-art face clustering methods by a large margin.
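
To make the "naive density" in the abstract concrete, the following is a minimal sketch of a ball-indicator density: for each point, the fraction of sample points that fall within a fixed-radius Euclidean ball around it. This illustrates only the baseline the abstract criticizes, not the paper's kernel diffusion density or its linear-time surrogate. The function name `naive_ball_density`, the `radius` parameter, and the demo data are assumptions introduced here for illustration, and the estimate is left unnormalized by the ball volume.

```python
import numpy as np

def naive_ball_density(points, radius=1.0):
    """Fraction of sample points within `radius` of each point (self included).

    Hypothetical helper for illustration; uses a dense pairwise distance
    matrix, so it needs O(n^2) time and memory.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise differences via broadcasting: shape (n, n, d).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Indicator of the Euclidean ball, averaged over the sample.
    return (dists <= radius).mean(axis=1)

# Three nearby points and one isolated point (made-up demo data).
demo_points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
demo_density = naive_ball_density(demo_points, radius=1.0)
```

Because the radius is the same everywhere, this estimate cannot adapt to regions whose scale or smoothness differs, which is the limitation the abstract attributes to the naive density.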

