Directional Self-supervised Learning for Heavy Image Augmentations
This work addresses a bottleneck in self-supervised learning for computer vision by making it compatible with a wider range of augmentations, though it is incremental as it builds on existing frameworks like SimCLR.
The paper tackles the limited augmentation compatibility of self-supervised image representation learning. It proposes a directional self-supervised learning paradigm (DSSL) that treats the augmented views of an instance as a partially ordered set, and reports stable improvements over various baselines on the CIFAR and ImageNet datasets.
Despite the large augmentation family, only a few cherry-picked robust augmentation policies are beneficial to self-supervised image representation learning. In this paper, we propose a directional self-supervised learning paradigm (DSSL), which is compatible with significantly more augmentations. Specifically, we apply heavy augmentation policies on top of views that have been lightly augmented by standard augmentations, generating harder views (HV). An HV usually deviates more from the original image than the lightly augmented standard view (SV) does. Unlike previous methods, which pair all augmented views equally and symmetrically maximize their similarities, DSSL treats the augmented views of the same instance as a partially ordered set (with directions SV$\leftrightarrow$SV and SV$\leftarrow$HV) and equips it with a directional objective function that respects the derived relationships among views. DSSL can be implemented with a few lines of code and is highly flexible with respect to popular self-supervised learning frameworks, including SimCLR, SimSiam, and BYOL. Extensive experimental results on CIFAR and ImageNet demonstrate that DSSL stably improves various baselines while remaining compatible with a wider range of augmentations.
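
The partial order described above (SV$\leftrightarrow$SV, SV$\leftarrow$HV) can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: the function names `neg_cosine`, `directed_pairs`, and `directional_loss` are hypothetical, the negative cosine similarity is borrowed from SimSiam-style objectives as an assumption, and NumPy stands in for an autograd framework, so the stop-gradient on targets is only indicated in comments.

```python
import numpy as np

def neg_cosine(pred, target):
    """Mean negative cosine similarity between row-wise paired vectors."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    return -np.mean(np.sum(pred * target, axis=1))

def directed_pairs(kinds):
    """Return the (source, target) view pairs allowed by the partial order.

    kinds holds one label per view, "SV" or "HV".
    A pair is kept only when the target is a standard view, so:
      SV <-> SV  both directions are kept,
      SV <-  HV  only HV as source and SV as target is kept,
      HV -- HV   no pair is kept.
    """
    pairs = []
    for i in range(len(kinds)):
        for j, target_kind in enumerate(kinds):
            if i != j and target_kind == "SV":
                pairs.append((i, j))
    return pairs

def directional_loss(preds, targets, kinds):
    """Average the pairwise loss over the allowed directed pairs.

    preds[i]   : (batch, dim) output for view i on the branch that is trained.
    targets[j] : (batch, dim) output for view j, assumed to be treated as a
                 constant (stop-gradient) in an autograd framework.
    """
    pairs = directed_pairs(kinds)
    return sum(neg_cosine(preds[i], targets[j]) for i, j in pairs) / len(pairs)

# Two standard views and two harder views of the same batch of instances.
kinds = ["SV", "SV", "HV", "HV"]
rng = np.random.default_rng(0)
views = [rng.normal(size=(8, 16)) for _ in kinds]
loss = directional_loss(views, views, kinds)
```

Under these assumptions, a symmetric baseline would correspond to keeping every ordered pair with `i != j`; the directional variant simply drops the pairs whose target is a harder view.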