LG | IR | Dec 17, 2025

Topological Metric for Unsupervised Embedding Quality Evaluation

arXiv:2512.15285v1 | 1 citation | h-index: 6
Originality: Highly original
AI Analysis

The paper addresses a critical challenge in unsupervised representation learning: judging embedding quality without labels, so that researchers and practitioners can select models and hyperparameters reliably.

The paper tackles the problem of evaluating embedding quality without labels by proposing Persistence, a topology-aware metric based on persistent homology, which consistently achieves top-tier correlations with downstream performance across diverse domains.

Modern representation learning increasingly relies on unsupervised and self-supervised methods trained on large-scale unlabeled data. While these approaches achieve impressive generalization across tasks and domains, evaluating embedding quality without labels remains an open challenge. In this work, we propose Persistence, a topology-aware metric based on persistent homology that quantifies the geometric structure and topological richness of embedding spaces in a fully unsupervised manner. Unlike metrics that assume linear separability or rely on covariance structure, Persistence captures global and multi-scale organization. Empirical results across diverse domains show that Persistence consistently achieves top-tier correlations with downstream performance, outperforming existing unsupervised metrics and enabling reliable model and hyperparameter selection.
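To make the idea of a persistent-homology summary concrete, the sketch below computes the simplest such quantity for a point cloud: the death times of 0-dimensional classes (connected components) in a Vietoris-Rips filtration, which equal the edge lengths of a Euclidean minimum spanning tree. This is only an illustration under stated assumptions. The abstract does not give the exact definition of Persistence, so the function names `h0_death_times` and `total_h0_persistence`, the restriction to dimension 0, and the choice of the sum as a summary are all hypothetical and should not be read as the authors' method.

```python
import math


def h0_death_times(points):
    """Return sorted death times of 0-dimensional persistence classes.

    Assumption: a Vietoris-Rips filtration where an edge enters at the
    scale equal to its Euclidean length. Under that convention, each
    component merge happens at the length of a minimum-spanning-tree
    edge, so the death times are exactly the MST edge lengths.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    best[0] = 0.0
    deaths = []
    for step in range(n):
        # Prim's algorithm: pick the closest point not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            deaths.append(best[u])  # length of the MST edge that attached u
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return sorted(deaths)


def total_h0_persistence(points):
    """Sum of finite H0 lifetimes (every class is born at scale 0)."""
    return sum(h0_death_times(points))


# Three collinear points at x = 0, 1, 3: merges occur at scales 1 and 2.
print(h0_death_times([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))  # [1.0, 2.0]
```

In practice an embedding matrix would be passed as a list of coordinate tuples, and higher-dimensional features (loops, voids) would need a dedicated persistent-homology library rather than this O(n^2) spanning-tree shortcut.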

