LG CV SP | May 26, 2023

Modulate Your Spectrum in Self-Supervised Learning

Xi Weng, Yunhao Ni, Tengwei Song, Jie Luo, Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan, Lei Huang

arXiv:2305.16789v2 | 10.7 | 10 citations | Has Code

Originality: Highly original

AI Analysis

This work addresses a fundamental issue in self-supervised learning for computer vision, offering a novel method to enhance representation quality, though it is incremental as it builds upon existing whitening loss approaches.

The paper tackles feature collapse in self-supervised learning by introducing Spectral Transformation (ST), a framework for modulating the spectrum of the embedding. It proposes a specific ST instance, INTL, that prevents collapse and improves representation learning, and it reports results on ImageNet classification and COCO object detection.

Whitening loss offers a theoretical guarantee against feature collapse in self-supervised learning (SSL) with joint embedding architectures. Typically, it involves a hard whitening approach, transforming the embedding and applying loss to the whitened output. In this work, we introduce Spectral Transformation (ST), a framework to modulate the spectrum of the embedding and to seek functions beyond whitening that can avoid dimensional collapse. We show that whitening is a special instance of ST by definition, and our empirical investigations unveil other ST instances capable of preventing collapse. Additionally, we propose a novel ST instance named IterNorm with trace loss (INTL). Theoretical analysis confirms INTL's efficacy in preventing collapse and modulating the spectrum of the embedding toward equal eigenvalues during optimization. Our experiments on ImageNet classification and COCO object detection demonstrate INTL's potential in learning superior representations. The code is available at https://github.com/winci-ai/INTL.
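The idea that whitening is one instance of a spectral transformation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes that ST applies a scalar function g to the eigenvalues of the embedding covariance and maps the centered embedding through the resulting matrix. The names `spectral_transform` and `whitening_g`, the `(d, m)` layout, and the `eps` clamp are hypothetical choices made for illustration.

```python
import numpy as np

def spectral_transform(Z, g, eps=1e-8):
    """Apply a spectral transformation to an embedding matrix.

    Z   : array of shape (d, m), d feature dimensions and m samples (assumed layout).
    g   : function applied elementwise to the covariance eigenvalues (assumption).
    eps : lower clamp on eigenvalues to avoid division by zero (illustrative).
    """
    # Center each feature dimension across the batch.
    Zc = Z - Z.mean(axis=1, keepdims=True)
    # Covariance of the embedding, shape (d, d).
    cov = Zc @ Zc.T / Zc.shape[1]
    # Symmetric eigendecomposition: cov = U diag(lam) U^T.
    lam, U = np.linalg.eigh(cov)
    lam = np.clip(lam, eps, None)
    # Transformation matrix U diag(g(lam)) U^T, applied to the centered embedding.
    phi = U @ np.diag(g(lam)) @ U.T
    return phi @ Zc

def whitening_g(lam):
    # Whitening as a special case: g(lambda) = lambda^(-1/2).
    return lam ** -0.5

# Toy embedding with a deliberately uneven spectrum.
rng = np.random.default_rng(0)
scales = np.array([[3.0], [1.0], [0.5], [0.1]])
Z = rng.normal(size=(4, 256)) * scales

Z_hat = spectral_transform(Z, whitening_g)
cov_hat = Z_hat @ Z_hat.T / Z_hat.shape[1]
# With g = lambda^(-1/2), the output covariance is the identity,
# so all eigenvalues of the transformed embedding are equal.
```

Passing a different `g` (for example a power other than -1/2) changes how the output spectrum is shaped, which is the kind of function "beyond whitening" that the abstract refers to. Which choices of `g` actually avoid collapse during training is an empirical and theoretical question addressed in the paper, not by this sketch.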
