LG | AI | Jun 20, 2023

Relating tSNE and UMAP to Classical Dimensionality Reduction

arXiv:2306.11898v2 | 6 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the interpretability of the explainability methods that AI researchers rely on. It appears incremental because it builds on existing techniques.

The paper tackles the lack of explainability in gradient-based dimensionality reduction methods like UMAP by relating them to classical techniques, showing that modified Locally Linear Embeddings can reproduce UMAP outputs and that classical methods can be recovered in the modern paradigm.

It has become standard to use gradient-based dimensionality reduction (DR) methods like tSNE and UMAP when explaining what AI models have learned. This makes sense: these methods are fast, robust, and have an uncanny ability to find semantic patterns in high-dimensional data without supervision. Despite this, gradient-based DR methods lack the most important quality that an explainability method should possess: themselves being explainable. That is, given a UMAP output, it is currently unclear what one can say about the corresponding input. We work towards closing this question by relating UMAP to classical DR techniques. Specifically, we show that one can fully recover methods like PCA, MDS, and ISOMAP in the modern DR paradigm: by applying attractions and repulsions onto a randomly initialized dataset. We also show that, with a small change, Locally Linear Embeddings (LLE) can indistinguishably reproduce UMAP outputs. This implies that the UMAP effective objective is minimized by this modified version of LLE (and vice versa). Given this, we discuss what must be true of UMAP embeddings and present avenues for future work.
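To make the "attractions and repulsions onto a randomly initialized dataset" idea concrete, here is a minimal sketch. It is not the paper's formulation, which this page does not give. It assumes a generic MDS-style stress objective, minimized by plain gradient descent with NumPy. The function names (`pairwise_distances`, `mds_attract_repel`) and all hyperparameters are hypothetical choices made for illustration. In the gradient, a pair that sits too far apart in the embedding is pulled together and a pair that sits too close is pushed apart.

```python
import numpy as np


def pairwise_distances(Z):
    """Euclidean distance matrix between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def mds_attract_repel(X, dim=2, steps=500, lr=0.05, seed=0):
    """Illustrative only: minimise the raw stress
        sum_{i<j} (||y_i - y_j|| - d_ij)^2
    by gradient descent from a random start (assumed objective, not the paper's).
    Returns the embedding and the stress recorded before each update."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    D = pairwise_distances(X)                   # target distances in input space
    Y = rng.normal(scale=1e-2, size=(n, dim))   # random initialisation
    history = []
    for _ in range(steps):
        diff = Y[:, None, :] - Y[None, :, :]    # shape (n, n, dim)
        dist = np.linalg.norm(diff, axis=-1)    # shape (n, n)
        history.append(0.5 * np.sum((dist - D) ** 2))  # each pair counted once
        # Put 1 on the diagonal (plus a tiny epsilon) to avoid dividing by zero.
        safe = dist + np.eye(n) + 1e-12
        # coeff > 0: pair is too far apart  -> the update attracts it.
        # coeff < 0: pair is too close      -> the update repels it.
        coeff = (dist - D) / safe
        grad = 2.0 * np.einsum("ij,ijk->ik", coeff, diff)
        Y -= (lr / n) * grad
    return Y, history


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(50, 5))
    Y, history = mds_attract_repel(X)
    print("initial stress:", history[0])
    print("final stress:  ", history[-1])
```

The step size is divided by the number of points so that the update stays on a comparable scale as the dataset grows. This is a convenience of the sketch, not something taken from the paper.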

