LG, AT, ML · Feb 2, 2024

Mapping the Multiverse of Latent Representations

arXiv:2402.01514v2 · 11 citations · h-index: 6 · ICML
AI Analysis

This work addresses reliability and robustness concerns in machine learning, targeting researchers and practitioners who use models built on latent representations. It is incremental in that it builds on existing multiverse analysis concepts.

The authors tackle variability and unreliability in the latent representations of machine-learning models by introducing PRESTO, a framework that uses persistent homology to map and compare latent spaces across different methods, hyperparameters, and datasets. This enables sensitivity analysis, anomaly detection, and more efficient navigation of hyperparameter search spaces.

Echoing recent calls to counter reliability and robustness concerns in machine learning via multiverse analysis, we present PRESTO, a principled framework for mapping the multiverse of machine-learning models that rely on latent representations. Although such models enjoy widespread adoption, the variability in their embeddings remains poorly understood, resulting in unnecessary complexity and untrustworthy representations. Our framework uses persistent homology to characterize the latent spaces arising from different combinations of diverse machine-learning methods, (hyper)parameter configurations, and datasets, allowing us to measure their pairwise (dis)similarity and statistically reason about their distributions. As we demonstrate both theoretically and empirically, our pipeline preserves desirable properties of collections of latent representations, and it can be leveraged to perform sensitivity analysis, detect anomalous embeddings, or efficiently and effectively navigate hyperparameter search spaces.
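
To make the idea of comparing latent spaces through persistent homology concrete, here is a minimal sketch. It is not the PRESTO pipeline itself, because the abstract does not specify its descriptors or distances. As a simplifying assumption, it uses only 0-dimensional persistence of a Vietoris-Rips filtration, whose death times coincide with the edge lengths of a Euclidean minimum spanning tree. The function names `h0_death_times` and `topological_dissimilarity`, the normalisation by the largest death time, and the toy embeddings are all hypothetical choices made for illustration.

```python
import math
import random


def h0_death_times(points):
    """Sorted death times of 0-dimensional Vietoris-Rips persistence.

    These equal the edge lengths of a Euclidean minimum spanning tree,
    computed here with Prim's algorithm in O(n^2) time.
    """
    n = len(points)
    if n < 2:
        return []
    # best[i] = distance from point i to the closest point already in the tree
    best = [math.dist(points[0], p) for p in points]
    in_tree = [False] * n
    in_tree[0] = True
    deaths = []
    for _ in range(n - 1):
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        deaths.append(best[j])  # two components merge at this scale
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return sorted(deaths)


def topological_dissimilarity(emb_a, emb_b):
    """L2 distance between scale-normalised H0 death-time vectors.

    Assumption: both embeddings contain the same number of points.
    """
    da, db = h0_death_times(emb_a), h0_death_times(emb_b)
    if len(da) != len(db):
        raise ValueError("embeddings must contain the same number of points")
    if not da:
        return 0.0
    # Divide by the largest death time so that rescaling has no effect.
    sa, sb = (max(da) or 1.0), (max(db) or 1.0)
    return math.sqrt(sum((x / sa - y / sb) ** 2 for x, y in zip(da, db)))


if __name__ == "__main__":
    rng = random.Random(0)
    # A toy "multiverse": three hypothetical 2-D embeddings of 50 points each.
    base = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
    multiverse = {
        "config_a": base,
        "config_a_rotated": [(-y, x) for x, y in base],
        "config_b_two_clusters": [
            (rng.gauss(0, 0.1) + 5 * (i % 2), rng.gauss(0, 0.1)) for i in range(50)
        ],
    }
    names = list(multiverse)
    for u in names:
        row = [topological_dissimilarity(multiverse[u], multiverse[v]) for v in names]
        print(u, [round(d, 3) for d in row])
```

Because the descriptor depends only on pairwise distances and is normalised by scale, rotated or uniformly rescaled copies of an embedding score as near-identical, while embeddings with a different cluster structure score as far apart. A fuller treatment would likely use higher-dimensional homology and a dedicated library such as GUDHI or Ripser rather than this hand-rolled spanning tree.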

Code Implementations: 1 repo
