DS FA MLOct 22, 2018

Optimal terminal dimensionality reduction in Euclidean space

arXiv:1810.09250v110.853 citations

Originality Highly original

AI Analysis

This solves an open problem in theoretical computer science, providing optimal bounds for terminal dimensionality reduction, which is incremental but important for applications like nearest neighbor search.

The paper tackles the problem of dimensionality reduction in Euclidean space by strengthening the Johnson-Lindenstrauss lemma to preserve distances from all points in space to a given set, not just within the set, achieving an improved bound of m = O(ε^{-2} log n) compared to the previous O(ε^{-4} log n).

Let $\varepsilon\in(0,1)$ and $X\subset\mathbb R^d$ be arbitrary with $|X|$ having size $n>1$. The Johnson-Lindenstrauss lemma states there exists $f:X\rightarrow\mathbb R^m$ with $m = O(\varepsilon^{-2}\log n)$ such that $$ \forall x\in X\ \forall y\in X, \|x-y\|_2 \le \|f(x)-f(y)\|_2 \le (1+\varepsilon)\|x-y\|_2 . $$ We show that a strictly stronger version of this statement holds, answering one of the main open questions of [MMMR18]: "$\forall y\in X$" in the above statement may be replaced with "$\forall y\in\mathbb R^d$", so that $f$ not only preserves distances within $X$, but also distances to $X$ from the rest of space. Previously this stronger version was only known with the worse bound $m = O(\varepsilon^{-4}\log n)$. Our proof is via a tighter analysis of (a specific instantiation of) the embedding recipe of [MMMR18].

View on arXiv PDF

Similar