AAG: Self-Supervised Representation Learning by Auxiliary Augmentation with GNT-Xent Loss
This is an incremental improvement to contrastive self-supervised learning for computer vision tasks that lack manual annotations. It addresses the extra memory and storage cost of existing methods while improving accuracy.
The paper tackles the limitations of augmentation-based contrastive learning in self-supervised representation learning by introducing AAG, which combines an auxiliary augmentation strategy with the GNT-Xent loss to improve performance and efficiency. AAG achieves 94.5% top-1 accuracy on CIFAR10 with batch size 64, which is 0.5% higher than the best result of SimCLR with batch size 1024.
Self-supervised representation learning is an emerging research topic because of its powerful capacity to learn from unlabeled data. As a mainstream self-supervised learning method, augmentation-based contrastive learning has achieved great success in various computer vision tasks that lack manual annotations. Despite this progress, existing methods often incur extra memory or storage cost, and their performance still leaves considerable room for improvement. Here we present a self-supervised representation learning method, AAG, which features an auxiliary augmentation strategy and the GNT-Xent loss. The auxiliary augmentation improves the performance of contrastive learning by increasing the diversity of images. The proposed GNT-Xent loss enables a steady and fast training process and yields competitive accuracy. Experimental results demonstrate the superiority of AAG over previous state-of-the-art methods on CIFAR10, CIFAR100, and SVHN. In particular, AAG achieves 94.5% top-1 accuracy on CIFAR10 with batch size 64, which is 0.5% higher than the best result of SimCLR with batch size 1024.
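For context on the baseline named above, the following is a minimal sketch of the standard NT-Xent contrastive loss used by SimCLR, written in plain Python. It is not the GNT-Xent loss proposed here, whose definition this abstract does not give, and it does not include the auxiliary augmentation. The function names `cosine` and `nt_xent`, the list-of-lists embedding format, and the default temperature of 0.5 are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(view_a, view_b, temperature=0.5):
    """Standard NT-Xent loss (SimCLR baseline), not GNT-Xent.

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    Each embedding is pulled toward its partner and pushed away from the
    other 2N - 2 embeddings in the batch.
    """
    n = len(view_a)
    z = list(view_a) + list(view_b)  # 2N embeddings in total
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Temperature-scaled similarities to every embedding except itself.
        logits = {
            k: cosine(z[i], z[k]) / temperature
            for k in range(2 * n)
            if k != i
        }
        log_denominator = math.log(sum(math.exp(s) for s in logits.values()))
        total += log_denominator - logits[pos]  # -log softmax of the positive
    return total / (2 * n)


# Toy embeddings (assumed values): two images, two views each.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy batch the loss is lower when the two views of each image agree than when the views are swapped between images, which is the behavior a contrastive objective is meant to reward. Because the negatives come from the rest of the batch, this baseline is sensitive to batch size, which is why the batch sizes are reported alongside the accuracy comparison above.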