DS LGJul 2, 2025

Dynamic Similarity Graph Construction with Kernel Density Estimation

arXiv:2507.01696v11.2h-index: 4ICML

Originality Incremental advance

AI Analysis

This work addresses the need for scalable dynamic clustering in machine learning, though it appears incremental as it builds on existing kernel density estimation and graph construction methods.

The paper tackles the problem of efficiently maintaining kernel density estimates and similarity graphs in dynamic settings where data points are added over time, resulting in a fast dynamic spectral clustering algorithm that was evaluated on synthetic and real-world datasets.

In the kernel density estimation (KDE) problem, we are given a set $X$ of data points in $\mathbb{R}^d$, a kernel function $k: \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}$, and a query point $\mathbf{q} \in \mathbb{R}^d$, and the objective is to quickly output an estimate of $\sum_{\mathbf{x} \in X} k(\mathbf{q}, \mathbf{x})$. In this paper, we consider $\textsf{KDE}$ in the dynamic setting, and introduce a data structure that efficiently maintains the estimates for a set of query points as data points are added to $X$ over time. Based on this, we design a dynamic data structure that maintains a sparse approximation of the fully connected similarity graph on $X$, and develop a fast dynamic spectral clustering algorithm. We further evaluate the effectiveness of our algorithms on both synthetic and real-world datasets.

View on arXiv PDF

Similar