CV · Jun 14, 2022

RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation

arXiv:2206.07047v1 · 10 citations · h-index: 44
Originality: Incremental advance
AI Analysis

The paper addresses a novel, challenging computer vision task that matters for applications requiring cross-modal image alignment, though its method is an incremental advance.

The paper tackles the problem of registering synchronized RGB and multispectral images with different resolutions by solving stereo matching correspondences, achieving an average registration error of 1.16 pixels.

We address the problem of registering synchronized color (RGB) and multi-spectral (MS) images with very different resolutions by solving stereo matching correspondences. To this end, we introduce a novel RGB-MS dataset framing 13 different scenes in indoor environments and providing a total of 34 image pairs annotated with semi-dense, high-resolution ground-truth labels in the form of disparity maps. To tackle the task, we propose a deep learning architecture trained in a self-supervised manner by exploiting an additional RGB camera, required only during training data acquisition. In this setup, we can conveniently learn cross-modal matching in the absence of ground-truth labels by distilling knowledge from an easier RGB-RGB matching task based on a collection of about 11K unlabeled image triplets. Experiments show that the proposed pipeline sets a good performance bar (1.16 pixels average registration error) for future research on this novel, challenging task.
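To make the reported metric concrete, the following is a minimal sketch of how an average registration error could be computed against semi-dense ground truth. It assumes the error is the mean absolute difference between predicted and ground-truth disparities over labeled pixels only; the abstract does not spell out the exact definition, so this is an illustration rather than the paper's evaluation code. The function name `mean_disparity_error`, the use of `None` for unlabeled pixels, and the toy values are all hypothetical.

```python
from typing import Optional, Sequence

# A disparity map as rows of per-pixel values; None marks a pixel
# without a ground-truth label (assumed encoding for "semi-dense").
DisparityMap = Sequence[Sequence[Optional[float]]]


def mean_disparity_error(pred: DisparityMap, gt: DisparityMap) -> float:
    """Mean absolute disparity error over pixels that carry a label."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g is None:  # skip unlabeled ground-truth pixels
                continue
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("ground truth contains no labeled pixels")
    return total / count


# Toy 2x3 example with made-up values, not data from the paper.
pred = [[10.0, 12.5, 8.0],
        [9.0, 11.0, 7.5]]
gt = [[11.0, None, 8.0],
      [None, 10.0, 9.5]]

error = mean_disparity_error(pred, gt)
print(f"average error: {error:.2f} px")
```

Restricting the average to labeled pixels matters for semi-dense annotations: counting unlabeled pixels as zero error or zero disparity would bias the result.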

