CV AI LG | Nov 28, 2021

ExCon: Explanation-driven Supervised Contrastive Learning for Image Classification

Zhibo Zhang, Jongseong Jang, Chiheb Trabelsi, Ruiwen Li, Scott Sanner, Yeonjeong Jeong, Dongsub Shim

arXiv:2111.14271v63.75 citations | Has Code

Originality: Highly original

AI Analysis

This work addresses a key limitation in contrastive learning for image classification by preserving semantic content during augmentation, offering improvements in multiple performance metrics for researchers and practitioners in computer vision.

The paper tackles a problem in contrastive learning: augmentations may alter image semantics, which can hurt downstream task performance. It proposes ExCon, which uses saliency-based explanations to create content-preserving masked augmentations. In experiments on the CIFAR-100 and Tiny ImageNet datasets, ExCon outperforms vanilla supervised contrastive learning in classification, explanation quality, adversarial robustness, and probabilistic calibration under distributional shift.

Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification. However, a key drawback of existing contrastive augmentation methods is that they may lead to the modification of the image content which can yield undesired alterations of its semantics. This can affect the performance of the model on downstream tasks. Hence, in this paper, we ask whether we can augment image data in contrastive learning such that the task-relevant semantic content of an image is preserved. For this purpose, we propose to leverage saliency-based explanation methods to create content-preserving masked augmentations for contrastive learning. Our novel explanation-driven supervised contrastive learning (ExCon) methodology critically serves the dual goals of encouraging nearby image embeddings to have similar content and explanation. To quantify the impact of ExCon, we conduct experiments on the CIFAR-100 and the Tiny ImageNet datasets. We demonstrate that ExCon outperforms vanilla supervised contrastive learning in terms of classification, explanation quality, adversarial robustness as well as probabilistic calibration in the context of distributional shift.
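The idea of a content-preserving masked augmentation can be illustrated with a minimal sketch. It assumes a per-pixel saliency map has already been produced by some explanation method, and it keeps only the most salient pixels. The function names, the `keep_fraction` parameter, the top-k thresholding rule, and the zero fill value are all hypothetical choices made for illustration; the abstract does not specify how ExCon builds its masks.

```python
from typing import List

Grid = List[List[float]]


def saliency_mask(saliency: Grid, keep_fraction: float = 0.5) -> List[List[int]]:
    """Return a binary mask marking the most salient pixels.

    Hypothetical rule: keep roughly the top `keep_fraction` of pixels by
    saliency. Ties at the threshold are all kept, so slightly more pixels
    than requested may survive.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    flat = sorted((v for row in saliency for v in row), reverse=True)
    n_keep = max(1, round(keep_fraction * len(flat)))
    threshold = flat[n_keep - 1]
    return [[1 if v >= threshold else 0 for v in row] for row in saliency]


def masked_augmentation(
    image: Grid, saliency: Grid, keep_fraction: float = 0.5, fill: float = 0.0
) -> Grid:
    """Keep salient pixels of `image` and replace the rest with `fill`."""
    mask = saliency_mask(saliency, keep_fraction)
    return [
        [px if m else fill for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 grayscale image with a made-up saliency map.
image = [[0.2, 0.9], [0.4, 0.7]]
saliency = [[0.1, 0.8], [0.3, 0.6]]
augmented = masked_augmentation(image, saliency, keep_fraction=0.5)
print(augmented)  # [[0.0, 0.9], [0.0, 0.7]]
```

In a supervised contrastive setup, a masked view like `augmented` would be treated as an additional positive for its source image, so that embeddings are pulled together for inputs that share both label and salient content.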
