CV, ML | Nov 24, 2021

MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning

Siladittya Manna, Umapada Pal, Saumik Bhattacharya

arXiv:2111.12664v3

Originality: Incremental advance

AI Analysis

This work addresses the need for more effective self-supervised learning methods in computer vision, offering incremental improvements over existing contrastive frameworks.

The paper aims to improve self-supervised contrastive learning with a new loss function that optimizes mutual information in positive and negative pairs. With a ResNet-18 backbone and 200 pretraining epochs, it reports gains over the SOTA contrastive baseline ranging from 0.33% on Tiny-ImageNet to 4.85% on STL-10, with 1.93% on CIFAR-10.

Self-supervised contrastive learning frameworks have progressed rapidly over the last few years. In this paper, we propose a novel loss function for contrastive learning. We model our pre-training task as a binary classification problem to induce an implicit contrastive effect. We further improve the naive loss function by removing the effect of the positive-positive repulsion and incorporating the upper bound of the negative pair repulsion. Unlike existing methods, the proposed loss function optimizes the mutual information in positive and negative pairs. We also present a closed-form expression for the parameter gradient flow and compare the behaviour of self-supervised contrastive frameworks using the Hessian eigenspectrum to analytically study their convergence. The proposed method outperforms SOTA self-supervised contrastive frameworks on benchmark datasets such as CIFAR-10, CIFAR-100, STL-10, and Tiny-ImageNet. After 200 pretraining epochs with ResNet-18 as the backbone, the proposed model achieves an accuracy of 86.36%, 58.18%, 80.50%, and 30.87% on the CIFAR-10, CIFAR-100, STL-10, and Tiny-ImageNet datasets, respectively, and surpasses the SOTA contrastive baseline by 1.93%, 3.57%, 4.85%, and 0.33%, respectively. The proposed framework also achieves state-of-the-art Top-1 linear evaluation accuracy of 78.4% (200 epochs) on ImageNet100 and 65.22% (100 epochs) on ImageNet1K.
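To make the "binary classification" framing concrete, here is a minimal sketch of a naive binary contrastive loss in plain Python. It is an illustration under stated assumptions, not the paper's loss: the abstract does not give the formula, so the sigmoid over temperature-scaled cosine similarity, the `temperature` value, and all function names below are hypothetical. The refinements the abstract mentions (removing positive-positive repulsion and bounding negative-pair repulsion) are not reproduced.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_contrastive_loss(view_a, view_b, temperature=0.5):
    """Naive binary cross-entropy over all cross-view pairs.

    Assumption (not from the paper): pair (i, j) is labeled 1 when both
    embeddings come from the same image (i == j) and 0 otherwise, and the
    predicted probability is sigmoid(cosine / temperature).
    """
    eps = 1e-12  # guards against log(0)
    total, count = 0.0, 0
    for i, za in enumerate(view_a):
        for j, zb in enumerate(view_b):
            p = sigmoid(cosine(za, zb) / temperature)
            label = 1.0 if i == j else 0.0
            total -= label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps)
            count += 1
    return total / count


# Two images, two augmented views each (toy 2-D embeddings).
aligned = binary_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = binary_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy setup the loss is lower when the two views of each image agree and the views of different images are dissimilar, which is the implicit contrastive effect the binary formulation is meant to induce.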
