T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning
This work targets representation quality in self-supervised learning, which matters for AI systems that rely on unlabeled data. The contribution appears incremental, however, since it adds a new regularization method to existing SSL paradigms.
The paper aims to improve self-supervised representations by addressing two failure modes: dimensional collapse and lack of uniformity. It introduces T-REGS, a regularization framework based on Minimum Spanning Tree length, and validates the resulting gains in representation quality on synthetic data and SSL benchmarks.
Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations without labeled data, often by enforcing invariance to input transformations such as rotations or blurring. Recent studies have highlighted two pivotal properties for effective representations: (i) avoiding dimensional collapse, where the learned features occupy only a low-dimensional subspace, and (ii) enhancing uniformity of the induced distribution. In this work, we introduce T-REGS, a simple regularization framework for SSL based on the length of the Minimum Spanning Tree (MST) over the learned representation. We provide theoretical analysis demonstrating that T-REGS simultaneously mitigates dimensional collapse and promotes distribution uniformity on arbitrary compact Riemannian manifolds. Several experiments on synthetic data and on classical SSL benchmarks validate the effectiveness of our approach at enhancing representation quality.
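To make the central quantity concrete, the following is a minimal sketch of computing the MST length of a set of points, not the authors' implementation. It assumes Euclidean distances and uses a plain O(n^2) Prim's algorithm; the names `mst_length` and `circle_points` are hypothetical helpers introduced here for illustration. How this length enters the training objective is not specified in the abstract, so the sketch stops at the quantity itself.

```python
import math


def mst_length(points):
    """Total Euclidean edge length of the MST over `points` (Prim's algorithm, O(n^2)).

    Assumption: Euclidean distance is used as the edge weight.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    # best[v] is the shortest known distance from point v to the growing tree.
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        # Update the distances of the remaining points to the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def circle_points(n, arc):
    """Hypothetical helper: n points evenly spaced over an arc (radians) of the unit circle."""
    return [(math.cos(arc * i / n), math.sin(arc * i / n)) for i in range(n)]


if __name__ == "__main__":
    spread = circle_points(64, 2 * math.pi)  # spread over the whole circle
    clustered = circle_points(64, 0.2)       # squeezed into a short arc
    print("spread:   ", mst_length(spread))
    print("clustered:", mst_length(clustered))
```

In this toy setting the points spread over the whole circle yield a longer MST than the same number of points squeezed into a short arc, which gives some intuition for why MST length can serve as a signal of how evenly the representations are distributed.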