DSLGApr 4, 2022

The Fast Johnson-Lindenstrauss Transform is Even Faster

arXiv:2204.01800v17 citationsh-index: 29
Originality Incremental advance
AI Analysis

This work provides an incremental improvement to a foundational algorithm in machine learning and data science, benefiting applications requiring efficient high-dimensional data processing.

The paper tackles the problem of improving the embedding time of the Fast Johnson-Lindenstrauss transform for dimensionality reduction, achieving a speedup by a factor of α = Ω(min{ε⁻¹ln(1/ε), ln n}) in the k ln² n term, with a tight lower bound to confirm optimality.

The seminal Fast Johnson-Lindenstrauss (Fast JL) transform by Ailon and Chazelle (SICOMP'09) embeds a set of $n$ points in $d$-dimensional Euclidean space into optimal $k=O(\varepsilon^{-2} \ln n)$ dimensions, while preserving all pairwise distances to within a factor $(1 \pm \varepsilon)$. The Fast JL transform supports computing the embedding of a data point in $O(d \ln d +k \ln^2 n)$ time, where the $d \ln d$ term comes from multiplication with a $d \times d$ Hadamard matrix and the $k \ln^2 n$ term comes from multiplication with a sparse $k \times d$ matrix. Despite the Fast JL transform being more than a decade old, it is one of the fastest dimensionality reduction techniques for many tradeoffs between $\varepsilon, d$ and $n$. In this work, we give a surprising new analysis of the Fast JL transform, showing that the $k \ln^2 n$ term in the embedding time can be improved to $(k \ln^2 n)/α$ for an $α= Ω(\min\{\varepsilon^{-1}\ln(1/\varepsilon), \ln n\})$. The improvement follows by using an even sparser matrix. We also complement our improved analysis with a lower bound showing that our new analysis is in fact tight.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes