MLMar 10, 2017

Density Level Set Estimation on Manifolds with DBSCAN

arXiv:1703.03503v234 citations
Originality Incremental advance
AI Analysis

This provides a theoretical foundation for density-based clustering in high-dimensional and manifold settings, though it is incremental as it extends existing DBSCAN analysis.

The paper tackles the problem of estimating density level sets from samples, showing that DBSCAN can achieve optimal rates in Euclidean space and on manifolds, with adaptive tuning for unknown parameters.

We show that DBSCAN can estimate the connected components of the $λ$-density level set $\{ x : f(x) \ge λ\}$ given $n$ i.i.d. samples from an unknown density $f$. We characterize the regularity of the level set boundaries using parameter $β> 0$ and analyze the estimation error under the Hausdorff metric. When the data lies in $\mathbb{R}^D$ we obtain a rate of $\widetilde{O}(n^{-1/(2β+ D)})$, which matches known lower bounds up to logarithmic factors. When the data lies on an embedded unknown $d$-dimensional manifold in $\mathbb{R}^D$, then we obtain a rate of $\widetilde{O}(n^{-1/(2β+ d\cdot \max\{1, β\})})$. Finally, we provide adaptive parameter tuning in order to attain these rates with no a priori knowledge of the intrinsic dimension, density, or $β$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes