CV · LG · Jan 12, 2021

Explicit homography estimation improves contrastive self-supervised learning

arXiv:2101.04713v1
Originality: Incremental advance
AI Analysis

This incremental advance targets the compute cost of contrastive self-supervised learning, which matters to researchers and practitioners who train such models.

The paper tackles the compute bottleneck in contrastive self-supervised learning by adding a module that regresses affine or homography parameters alongside the original contrastive objective. The extra objective improves performance and learning speed on all datasets considered, and the affine transformation outperforms the general homography in every case.

The typical contrastive self-supervised algorithm uses a similarity measure in latent space as the supervision signal by contrasting positive and negative images directly or indirectly. Although the utility of self-supervised algorithms has improved recently, there are still bottlenecks hindering their widespread use, such as the compute needed. In this paper, we propose a module that serves as an additional objective in the self-supervised contrastive learning paradigm. We show how the inclusion of this module to regress the parameters of an affine transformation or homography, in addition to the original contrastive objective, improves both performance and learning speed. Importantly, we ensure that this module does not enforce invariance to the various components of the affine transform, as this is not always ideal. We demonstrate the effectiveness of the additional objective on two recent, popular self-supervised algorithms. We perform an extensive experimental analysis of the proposed method and show an improvement in performance for all considered datasets. Further, we find that although both the general homography and affine transformation are sufficient to improve performance and convergence, the affine transformation performs better in all cases.
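The idea of adding a parameter-regression objective to a contrastive loss can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the five-parameter affine parameterization, the sampling ranges, the mean-squared-error regression loss, the `weight` coefficient, and all function names are hypothetical choices made for the example. The contrastive loss is passed in as a plain number because its exact form depends on the base algorithm.

```python
import math
import random


def sample_affine_params(rng):
    """Draw hypothetical affine parameters (ranges are assumptions)."""
    return [
        rng.uniform(-math.pi / 6, math.pi / 6),  # rotation angle in radians
        rng.uniform(0.8, 1.2),                   # isotropic scale
        rng.uniform(-0.2, 0.2),                  # shear
        rng.uniform(-0.1, 0.1),                  # translation in x
        rng.uniform(-0.1, 0.1),                  # translation in y
    ]


def affine_matrix(params):
    """Compose a 2x3 matrix as scale * rotation @ shear, with translation."""
    angle, scale, shear, tx, ty = params
    c, s = math.cos(angle), math.sin(angle)
    return [
        [scale * c, scale * (c * shear - s), tx],
        [scale * s, scale * (s * shear + c), ty],
    ]


def mse(pred, target):
    """Mean squared error between predicted and true parameter vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def combined_loss(contrastive_loss, pred_params, true_params, weight=1.0):
    """Contrastive objective plus a weighted parameter-regression term."""
    return contrastive_loss + weight * mse(pred_params, true_params)


rng = random.Random(0)
true_params = sample_affine_params(rng)   # parameters used to warp the image
pred_params = [0.0, 1.0, 0.0, 0.0, 0.0]   # stand-in for the module's output
total = combined_loss(0.7, pred_params, true_params, weight=0.5)
```

Because the regression term asks the network to recover the transformation parameters rather than ignore them, the representation has to retain information about the transform. This matches the abstract's point that the module does not enforce invariance to the components of the affine transform. A general homography would need eight parameters instead of the affine transform's six degrees of freedom, which the sketch does not cover.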

