MLLGNASTFeb 13, 2023

Kernelized Diffusion maps

arXiv:2302.06757v19 citationsh-index: 108
Originality Highly original
AI Analysis

This addresses a fundamental bottleneck in dimensionality reduction for high-dimensional data analysis, offering a theoretically sound solution with potential computational optimizations.

The paper tackles the curse of dimensionality in spectral clustering and diffusion maps by introducing a kernel-based Laplacian estimator that adapts to data regularity, proving non-asymptotic statistical rates to circumvent this issue.

Spectral clustering and diffusion maps are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data. The core of these procedures is the approximation of a Laplacian through a graph kernel approach, however this local average construction is known to be cursed by the high-dimension d. In this article, we build a different estimator of the Laplacian, via a reproducing kernel Hilbert space method, which adapts naturally to the regularity of the problem. We provide non-asymptotic statistical rates proving that the kernel estimator we build can circumvent the curse of dimensionality. Finally we discuss techniques (Nyström subsampling, Fourier features) that enable to reduce the computational cost of the estimator while not degrading its overall performance.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes