AI · LG · AP · ML · Aug 10, 2017

Automatic Selection of t-SNE Perplexity

arXiv:1708.03229v1 · 48 citations
Originality: Incremental advance
AI Analysis

This work addresses a practical bottleneck for users of t-SNE in data visualization, offering an incremental improvement by automating hyperparameter tuning.

The paper replaces manual selection of the t-SNE perplexity hyperparameter with an automatic selection method. The authors empirically validate that the perplexity settings it finds agree with human expert preferences across multiple datasets.

t-Distributed Stochastic Neighbor Embedding (t-SNE) is one of the most widely used dimensionality reduction methods for data visualization, but it has a perplexity hyperparameter that requires manual selection. In practice, proper tuning of t-SNE perplexity requires users to understand the inner workings of the method as well as to have hands-on experience. We propose a model selection objective for t-SNE perplexity that requires negligible extra computation beyond that of t-SNE itself. We empirically validate that the perplexity settings found by our approach are consistent with preferences elicited from human experts across a number of datasets. The similarities of our approach to the Bayesian information criterion (BIC) and minimum description length (MDL) are also analyzed.
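
The abstract does not spell out the objective, so the following is only a minimal sketch under a stated assumption: that the BIC-like criterion takes the form S(Perp) = 2 KL(P||Q) + log(n) Perp / n, where KL(P||Q) is the final t-SNE loss and n is the number of data points. The exact form should be checked against the paper. The function names and the KL values below are hypothetical, chosen purely for illustration.

```python
import math


def bic_style_score(kl_divergence, perplexity, n_samples):
    """Assumed BIC-like objective: data-fit term plus a complexity penalty.

    ASSUMPTION: S(Perp) = 2 * KL(P||Q) + log(n) * Perp / n.
    This form is not stated in the abstract above.
    """
    return 2.0 * kl_divergence + math.log(n_samples) * perplexity / n_samples


def select_perplexity(kl_by_perplexity, n_samples):
    """Return the perplexity with the lowest score, plus all scores.

    kl_by_perplexity maps each candidate perplexity to the final
    KL divergence of a t-SNE run fitted with that perplexity.
    """
    scores = {
        perplexity: bic_style_score(kl, perplexity, n_samples)
        for perplexity, kl in kl_by_perplexity.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Made-up KL values for illustration only (not results from the paper).
example_kl = {5: 1.90, 30: 1.25, 100: 0.95, 300: 0.90}
best_perplexity, scores = select_perplexity(example_kl, n_samples=1000)
print(best_perplexity, scores)
```

Because the score only needs the final KL divergence of each run, the extra cost beyond fitting t-SNE is a handful of arithmetic operations, which matches the "negligible extra computation" claim. In practice the KL values could come from any t-SNE implementation that reports its final loss; scikit-learn's `TSNE`, for example, exposes it as the `kl_divergence_` attribute after fitting.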

