CV | Jun 15, 2022

Self-Supervised Learning of Image Scale and Orientation

arXiv:2206.07259v1 | 20 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses the difficulty of obtaining large-scale pose annotations for image regions, which benefits computer vision tasks such as image matching and camera pose estimation. The advance is incremental because it builds on existing self-supervised learning techniques.

The paper tackles the problem of learning to assign scale and orientation to image regions without explicit annotations by proposing a self-supervised framework with histogram alignment. It shows significant improvements in scale/orientation estimation and enhances image matching and 6 DoF camera pose estimation.

We study the problem of learning to assign a characteristic pose, i.e., scale and orientation, for an image region of interest. Despite its apparent simplicity, the problem is non-trivial; it is hard to obtain a large-scale set of image regions with explicit pose annotations that a model directly learns from. To tackle the issue, we propose a self-supervised learning framework with a histogram alignment technique. It generates pairs of image patches by random rescaling/rotating and then trains an estimator to predict their scale/orientation values so that their relative difference is consistent with the rescaling/rotating used. The estimator learns to predict a non-parametric histogram distribution of scale/orientation without any supervision. Experiments show that it significantly outperforms previous methods in scale/orientation estimation and also improves image matching and 6 DoF camera pose estimation by incorporating our patch poses into a matching process.

Code Implementations: 1 repo
