LG · ML · Nov 20, 2017

Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions

arXiv:1711.07271v4 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses dimensionality reduction for machine learning and statistics, offering incremental improvements in robustness and out-of-sample handling.

The paper introduces a dimensionality reduction method whose embedding coordinates are the eigenvectors of a positive semi-definite kernel, obtained as the solution of an infinite-dimensional analogue of a semi-definite program. The resulting embedding is adaptive and non-linear, and it comes with an out-of-sample extension. The empirical results indicate that the method is more robust to outliers than a spectral embedding method, though no specific numerical gains are reported.

In machine learning or statistics, it is often desirable to reduce the dimensionality of a sample of data points in a high-dimensional space $\mathbb{R}^d$. This paper introduces a dimensionality reduction method where the embedding coordinates are the eigenvectors of a positive semi-definite kernel obtained as the solution of an infinite-dimensional analogue of a semi-definite program. This embedding is adaptive and non-linear. We discuss this problem under both weak and strong smoothness assumptions about the learned kernel. A main feature of our approach is the existence of an out-of-sample extension formula for the embedding coordinates in both cases. This extrapolation formula yields an extension of the kernel matrix to a data-dependent Mercer kernel function. Our empirical results indicate that this embedding method is more robust with respect to the influence of outliers, compared with a spectral embedding method.
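The general pattern the abstract describes (embed with the eigenvectors of a positive semi-definite kernel matrix, then extend the coordinates to unseen points) can be illustrated with a minimal sketch. This is not the paper's method: the paper learns its kernel from a semi-definite program, whereas the sketch below assumes a fixed Gaussian kernel and uses the standard Nyström formula as a stand-in for the out-of-sample extension. The function names, the bandwidth, and the synthetic data are all hypothetical.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # ASSUMPTION: a fixed Gaussian kernel stands in for the learned kernel.
    sq_dists = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def fit_embedding(K, n_components=2):
    # Eigendecomposition of the symmetric PSD kernel matrix;
    # keep the eigenpairs with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(K)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[top], eigvecs[:, top]

def extend_embedding(K_new, eigvals, eigvecs):
    # Nystrom-style extension (an assumption, not the paper's formula):
    # v_k(x) = (1 / lambda_k) * sum_i k(x, x_i) * v_k(x_i)
    return K_new @ eigvecs / eigvals

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 5))   # synthetic training sample in R^5
X_new = rng.normal(size=(3, 5))      # synthetic unseen points

K = gaussian_kernel(X_train, X_train)
eigvals, eigvecs = fit_embedding(K, n_components=2)

# Embedding coordinates of the training points are the eigenvector entries.
Z_train = eigvecs
# Coordinates of unseen points via the extension formula.
Z_new = extend_embedding(gaussian_kernel(X_new, X_train), eigvals, eigvecs)
# Applying the extension to the training points reproduces their coordinates.
Z_check = extend_embedding(K, eigvals, eigvecs)
```

Because each retained column of `eigvecs` satisfies `K @ v = lambda * v`, the extension formula agrees with the training coordinates on the training sample, which is the consistency property one expects from any out-of-sample extension of this kind.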

Code Implementations: 1 repo