LG | Oct 11, 2023

Self-supervised Representation Learning From Random Data Projectors

Yi Sui, Tongzi Wu, Jesse C. Cresswell, Ga Wu, George Stein, Xiao Shi Huang, Xiaochen Zhang, Maksims Volkovs

arXiv:2310.07756v214.318 citations | h-index: 25 | Has Code

Originality: Highly original

AI Analysis

This paper provides a self-supervised representation learning (SSRL) method that is applicable to any data modality, addressing a key bottleneck in the field: the reliance on modality-specific data augmentations.

The paper addresses the problem that augmentation-based SSRL is limited by the augmentations that are available and valid for each modality. It proposes learning representations by reconstructing random projections of the data, and it reports that this method outperforms state-of-the-art baselines on diverse tasks.

Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities, and can conflict with application-specific data augmentation constraints. This paper presents an SSRL approach that can be applied to any data modality and network architecture because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on a wide range of representation learning tasks that span diverse modalities and real-world applications. We show that it outperforms multiple state-of-the-art SSRL baselines. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study.
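
The idea of "reconstructing random data projections" can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the architecture, loss, or optimizer, so everything below is an assumption made for illustration. It assumes frozen random projectors of the form tanh(a_k · x), a linear encoder, linear prediction heads, a squared-error loss, and plain SGD. The function name `train_on_random_projections` and all parameters are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=1.0):
    """Return a rows x cols matrix of Gaussian entries as nested lists."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def train_on_random_projections(data, hidden_dim=4, num_projectors=6,
                                epochs=40, lr=0.005, seed=0):
    """Toy sketch: fit an encoder so that heads on top of it predict the
    outputs of frozen random projectors. No augmentations are used."""
    rng = random.Random(seed)
    dim = len(data[0])

    # Frozen random projectors g_k(x) = tanh(a_k . x) (assumed form).
    # They are sampled once and never trained.
    projectors = make_matrix(num_projectors, dim, rng)
    targets = [[math.tanh(t) for t in matvec(projectors, x)] for x in data]

    # Trainable parts: a linear encoder and one linear head per projector.
    encoder = make_matrix(hidden_dim, dim, rng, scale=0.1)
    heads = make_matrix(num_projectors, hidden_dim, rng, scale=0.1)

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(data, targets):
            z = matvec(encoder, x)        # representation of x
            pred = matvec(heads, z)       # predicted projector outputs
            err = [p - t for p, t in zip(pred, y)]
            total += sum(e * e for e in err)

            # Gradients of 0.5 * ||err||^2, computed before any update.
            grad_z = [sum(err[k] * heads[k][j] for k in range(num_projectors))
                      for j in range(hidden_dim)]
            for k in range(num_projectors):
                for j in range(hidden_dim):
                    heads[k][j] -= lr * err[k] * z[j]
            for j in range(hidden_dim):
                for i in range(dim):
                    encoder[j][i] -= lr * grad_z[j] * x[i]
        losses.append(total / len(data))  # mean squared error per sample
    return encoder, losses


if __name__ == "__main__":
    data_rng = random.Random(1)
    toy_data = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(64)]
    trained_encoder, history = train_on_random_projections(toy_data)
    print("first-epoch loss:", history[0], "last-epoch loss:", history[-1])
```

After training, the rows of `encoder` define the learned representation `z = encoder x`; the heads and projectors are only needed during training. The only supervision signal comes from the random projectors themselves, which is what makes such a scheme independent of modality-specific augmentations.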
