Self-supervised Representation Learning From Random Data Projectors
The work provides an SSRL method that applies to any data modality, addressing the field's dependence on modality-specific data augmentations.
The paper addresses a limitation of self-supervised representation learning: its reliance on data augmentations, which restricts its use across modalities. It proposes learning representations by reconstructing random data projections and shows that this approach outperforms state-of-the-art baselines on diverse tasks.
Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities, and can conflict with application-specific data augmentation constraints. This paper presents an SSRL approach that can be applied to any data modality and network architecture because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on a wide range of representation learning tasks that span diverse modalities and real-world applications. We show that it outperforms multiple state-of-the-art SSRL baselines. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study.
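The core idea, learning a representation by reconstructing random data projections, can be illustrated with a toy sketch. The abstract does not specify the projector design, encoder architecture, loss, or optimizer, so everything below is an assumption made for illustration: synthetic Gaussian data, fixed random projectors of the form tanh(xW), a linear encoder, one linear prediction head per projector, a mean-squared-error objective, and plain gradient descent. It should not be read as the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for an arbitrary modality (assumption: 256 samples, 8 features).
X = rng.normal(size=(256, 8))

# Hypothetical random data projectors g_k(x) = tanh(x @ W_k).
# They are randomly initialized once and never trained.
K, proj_dim, rep_dim = 4, 3, 5
projectors = [rng.normal(size=(8, proj_dim)) for _ in range(K)]
targets = [np.tanh(X @ W) for W in projectors]

# Trainable parts: a shared linear encoder and one linear head per projector.
enc = rng.normal(scale=0.3, size=(8, rep_dim))
heads = [rng.normal(scale=0.3, size=(rep_dim, proj_dim)) for _ in range(K)]


def loss_fn(enc, heads):
    """Mean squared error between head outputs and projector outputs, averaged over projectors."""
    Z = X @ enc
    return sum(np.mean((Z @ H - T) ** 2) for H, T in zip(heads, targets)) / K


initial_loss = loss_fn(enc, heads)

lr = 0.1
for _ in range(300):
    Z = X @ enc
    grad_enc = np.zeros_like(enc)
    for k in range(K):
        err = Z @ heads[k] - targets[k]          # reconstruction error for projector k
        scale = 2.0 / (err.size * K)             # derivative of the averaged MSE
        grad_head = Z.T @ err * scale
        grad_enc += X.T @ (err @ heads[k].T) * scale
        heads[k] -= lr * grad_head               # update head after its old value was used
    enc -= lr * grad_enc

final_loss = loss_fn(enc, heads)

# The learned representation is the encoder output; heads are discarded afterwards.
representations = X @ enc
```

In this sketch no augmentation or masking appears anywhere: the only training signal is the output of the frozen random projectors. The shared encoder has fewer dimensions than the combined projector outputs, so it must retain input information that is useful for several random targets at once.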