LG, CV · Apr 6, 2025

Variational Self-Supervised Learning

arXiv:2504.04318v39.41 citations · h-index: 5

Originality: Highly original

AI Analysis

This work addresses the need for scalable, probabilistically grounded representation learning in machine learning, offering a novel integration of variational modeling with self-supervised techniques.

The paper tackles efficient, decoder-free representation learning by combining variational inference with self-supervised learning. On CIFAR-10, CIFAR-100, and ImageNet-100 it reports performance competitive with or superior to leading methods such as BYOL and MoCo V3.

We present Variational Self-Supervised Learning (VSSL), a novel framework that combines variational inference with self-supervised learning to enable efficient, decoder-free representation learning. Unlike traditional VAEs that rely on input reconstruction via a decoder, VSSL symmetrically couples two encoders with Gaussian outputs. A momentum-updated teacher network defines a dynamic, data-dependent prior, while the student encoder produces an approximate posterior from augmented views. The reconstruction term in the ELBO is replaced with a cross-view denoising objective, preserving the analytical tractability of Gaussian KL divergence. We further introduce cosine-based formulations of KL and log-likelihood terms to enhance semantic alignment in high-dimensional latent spaces. Experiments on CIFAR-10, CIFAR-100, and ImageNet-100 show that VSSL performs competitively with or better than leading self-supervised methods, including BYOL and MoCo V3. VSSL offers a scalable, probabilistically grounded approach to learning transferable representations without generative reconstruction, bridging the gap between variational modeling and modern self-supervised techniques.
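
The abstract names two building blocks that have standard textbook forms: the closed-form KL divergence between Gaussians and a momentum (exponential moving average) teacher update. The sketch below illustrates both in plain Python, assuming diagonal covariances and flat parameter lists. The function names `gaussian_kl` and `ema_update` and the momentum value are illustrative assumptions, not taken from the paper, and the paper's cosine-based KL and log-likelihood terms are not reproduced here.

```python
import math


def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions.

    q plays the role of the student posterior and p the teacher-defined
    prior (an assumption about how the pieces would be wired together).
    """
    return sum(
        0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )


def ema_update(teacher, student, momentum=0.99):
    """Momentum update: teacher <- m * teacher + (1 - m) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


# Hypothetical two-dimensional example values.
kl = gaussian_kl([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
new_teacher = ema_update([1.0, 2.0], [0.0, 0.0], momentum=0.9)
```

Because both distributions are Gaussian, the KL term needs no sampling, which is the analytical tractability the abstract refers to.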
