LG · ML · Dec 2, 2019

Using Dimensionality Reduction to Optimize t-SNE

arXiv:1912.01098v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck for researchers who use t-SNE in unsupervised scenarios, though it is an incremental advance that builds on existing methods.

The paper tackled the high computational cost of t-SNE for high-dimensional data by using random projections to reduce dimensionality before applying t-SNE, resulting in preserved clustering and dramatically reduced runtime.

t-SNE is a popular tool for embedding multi-dimensional datasets into two or three dimensions. However, it has a large computational cost, especially when the input data has many dimensions. Many use t-SNE to embed the output of a neural network, which is generally of much lower dimension than the original data. This limits the use of t-SNE in unsupervised scenarios. We propose using random projections to embed high-dimensional datasets into relatively few dimensions, and then using t-SNE to obtain a two-dimensional embedding. We show that random projections preserve the desirable clustering achieved by t-SNE, while dramatically reducing the runtime of finding the embedding.
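
The two-stage pipeline described in the abstract (random projection first, then t-SNE) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `random_project` and `project_then_tsne`, the Gaussian projection matrix, and the target of 50 intermediate dimensions are all assumptions chosen for the example, since the abstract only says "relatively few dimensions". The t-SNE step uses scikit-learn's `TSNE`, which is likewise an assumed choice of library.

```python
import numpy as np


def random_project(X, n_components=50, seed=0):
    """Project the rows of X onto n_components random Gaussian directions.

    n_components=50 is an illustrative default, not a value from the paper.
    """
    rng = np.random.default_rng(seed)
    # Entries drawn from N(0, 1/n_components), so squared distances between
    # points are preserved in expectation after projection.
    R = rng.normal(0.0, 1.0 / np.sqrt(n_components),
                   size=(X.shape[1], n_components))
    return X @ R


def project_then_tsne(X, n_components=50, seed=0, **tsne_kwargs):
    """Random projection followed by a 2-D t-SNE embedding.

    Hypothetical helper: scikit-learn's TSNE stands in for whichever
    t-SNE implementation the paper actually used.
    """
    from sklearn.manifold import TSNE  # imported lazily; only this step needs it

    X_low = random_project(X, n_components=n_components, seed=seed)
    tsne = TSNE(n_components=2, random_state=seed, **tsne_kwargs)
    return tsne.fit_transform(X_low)
```

Calling `project_then_tsne(X)` on an array of shape `(n_samples, n_features)` returns an array of shape `(n_samples, 2)`. The expensive neighbour computations inside t-SNE then operate on 50 columns instead of `n_features`, which is where the runtime saving in this sketch would come from.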

Code Implementations: 1 repo
