LG, IR | Dec 17, 2025

Topological Metric for Unsupervised Embedding Quality Evaluation

Aleksei Shestov, Anton Klenitskiy, Daria Denisova, Amurkhan Dzagkoev, Daniil Petrovich, Andrey Savchenko, Maksim Makarenko

arXiv:2512.15285v1 | 1 citation | h-index: 6

Originality: Highly original

AI Analysis

This work addresses a critical challenge in unsupervised representation learning, the label-free evaluation of embeddings, and gives researchers and practitioners a basis for reliable model and hyperparameter selection.

The paper tackles the problem of evaluating embedding quality without labels by proposing Persistence, a topology-aware metric based on persistent homology, which consistently achieves top-tier correlations with downstream performance across diverse domains.

Modern representation learning increasingly relies on unsupervised and self-supervised methods trained on large-scale unlabeled data. While these approaches achieve impressive generalization across tasks and domains, evaluating embedding quality without labels remains an open challenge. In this work, we propose Persistence, a topology-aware metric based on persistent homology that quantifies the geometric structure and topological richness of embedding spaces in a fully unsupervised manner. Unlike metrics that assume linear separability or rely on covariance structure, Persistence captures global and multi-scale organization. Empirical results across diverse domains show that Persistence consistently achieves top-tier correlations with downstream performance, outperforming existing unsupervised metrics and enabling reliable model and hyperparameter selection.
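The abstract does not spell out how Persistence is computed, so the following is only a minimal sketch of the underlying idea, not the authors' metric. It assumes a Vietoris-Rips filtration under Euclidean distance and looks only at 0-dimensional homology (connected components), where each finite bar is born at scale 0 and dies at the length of a minimum spanning tree edge. The function names `h0_death_times` and `total_h0_persistence` are hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence


def h0_death_times(points: Sequence[Sequence[float]]) -> List[float]:
    """Death times of the finite H0 bars of a Vietoris-Rips filtration.

    Assumption: Euclidean distance. Every component is born at scale 0 and
    dies when it merges into another component, which happens at the edge
    lengths of a minimum spanning tree (built here with Prim's algorithm).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known edge from the tree to each point
    best[0] = 0.0          # start the tree at the first point
    deaths: List[float] = []
    for step in range(n):
        # Pick the point outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            deaths.append(best[u])  # this MST edge merges two components
        # Relax distances from the newly added point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return sorted(deaths)


def total_h0_persistence(points: Sequence[Sequence[float]]) -> float:
    """Sum of finite H0 bar lengths (births are all 0, so this sums deaths)."""
    return sum(h0_death_times(points))


if __name__ == "__main__":
    # Toy "embedding": two tight pairs of points that are far apart.
    toy = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    print(h0_death_times(toy))
    print(total_h0_persistence(toy))
```

In this toy case the one long bar reflects the gap between the two clusters, while the short bars reflect within-cluster spacing. This is the kind of multi-scale structure a persistence-based summary can expose. Higher-dimensional homology (loops, voids) would require a dedicated persistent homology library and is not covered by this sketch.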
