LGJun 20, 2022

GiDR-DUN; Gradient Dimensionality Reduction -- Differences and Unification

arXiv:2206.09689v11 citationsh-index: 33Has Code
Originality Incremental advance
AI Analysis

This work addresses the computational bottleneck in dimensionality reduction for data scientists, though it appears incremental as it unifies existing methods rather than introducing a fundamentally new approach.

The authors tackled the problem of combining TSNE and UMAP to achieve TSNE-quality embeddings at UMAP speeds, showing that a single normalization parameter can switch between the two algorithms and proposing a new method, GDR, that optimizes faster than UMAP and an order of magnitude faster than TSNE.

TSNE and UMAP are two of the most popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings. However, while attempts have been made to improve on TSNE's computational complexity, no existing method can obtain TSNE embeddings at the speed of UMAP. In this work, we show that this is indeed possible by combining the two approaches into a single method. We theoretically and experimentally evaluate the full space of parameters in the TSNE and UMAP algorithms and observe that a single parameter, the normalization, is responsible for switching between them. This, in turn, implies that a majority of the algorithmic differences can be toggled without affecting the embeddings. We discuss the implications this has on several theoretic claims underpinning the UMAP framework, as well as how to reconcile them with existing TSNE interpretations. Based on our analysis, we propose a new dimensionality reduction algorithm, GDR, that combines previously incompatible techniques from TSNE and UMAP and can replicate the results of either algorithm by changing the normalization. As a further advantage, GDR performs the optimization faster than available UMAP methods and thus an order of magnitude faster than available TSNE methods. Our implementation is plug-and-play with the traditional UMAP and TSNE libraries and can be found at github.com/Andrew-Draganov/GiDR-DUN.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes