Deep Bregman Divergence for Contrastive Learning of Visual Representations
This work addresses the challenge of enhancing self-supervised and semi-supervised learning for computer vision, offering incremental improvements in representation quality and generalization.
The paper aims to improve contrastive learning of visual representations by proposing a deep Bregman divergence framework that captures divergence between distributions rather than only between single points. Combined with a conventional contrastive loss, the framework outperforms the baseline and most previous methods on multiple classification and object detection tasks and datasets.
Deep Bregman divergence uses neural networks to measure the divergence between data points; it goes beyond Euclidean distance and can capture divergence over distributions. In this paper, we propose deep Bregman divergences for contrastive learning of visual representations, where we aim to enhance the contrastive loss used in self-supervised learning by training additional networks based on functional Bregman divergence. In contrast to conventional contrastive learning methods, which are based solely on divergences between single points, our framework can capture the divergence between distributions, which improves the quality of the learned representation. We show that the combination of the conventional contrastive loss and our proposed divergence loss outperforms the baseline and most previous methods for self-supervised and semi-supervised learning on multiple classification and object detection tasks and datasets. Moreover, the learned representations generalize well when transferred to other datasets and tasks. The source code and our models are available in the supplementary material and will be released with the paper.
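For context, the classical (non-deep) Bregman divergence that deep and functional variants build on can be sketched as follows. This is a minimal illustration of the textbook definition D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, not the method proposed in the paper: the function names, the two convex generators, and the sample vectors are all assumptions chosen for illustration, and no neural network is involved.

```python
import math


def bregman_divergence(phi, grad_phi, x, y):
    """Textbook Bregman divergence for a convex generator phi.

    D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>
    """
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# Illustrative generator 1: squared Euclidean norm.
def sq_norm(v):
    return sum(vi * vi for vi in v)


def sq_norm_grad(v):
    return [2.0 * vi for vi in v]


# Illustrative generator 2: negative Shannon entropy (strictly positive entries).
def neg_entropy(p):
    return sum(pi * math.log(pi) for pi in p)


def neg_entropy_grad(p):
    return [math.log(pi) + 1.0 for pi in p]


# Hypothetical sample points; both are probability vectors (entries sum to 1).
x = [0.2, 0.3, 0.5]
y = [0.4, 0.4, 0.2]

d_euclid = bregman_divergence(sq_norm, sq_norm_grad, x, y)
d_kl = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)

# Reference values computed directly, for comparison.
sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
kl = sum(xi * math.log(xi / yi) for xi, yi in zip(x, y))

print(d_euclid, sq_dist)
print(d_kl, kl)
```

With the squared norm as generator, the divergence reduces to the squared Euclidean distance; with negative entropy on probability vectors, it reduces to the KL divergence. This is one way to see how the Bregman family extends beyond Euclidean distance to divergences between distributions.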