CV | Feb 15, 2024

VisIRNet: Deep Image Alignment for UAV-taken Visible and Infrared Image Pairs

arXiv:2402.09635v15.213 citations | h-index: 3 | Has Code | IEEE Trans Geosci Remote Sens

Originality: Incremental advance

AI Analysis

This paper addresses multi-modal image alignment for UAV applications, offering an incremental improvement over existing deep LK-based methods.

The paper tackles the problem of aligning visible and infrared image pairs taken by UAVs, proposing a deep learning approach that achieves state-of-the-art results without using Lucas-Kanade methods, as demonstrated on four aerial datasets.

This paper proposes a deep-learning-based solution for multi-modal alignment of UAV-taken images. Many recently proposed state-of-the-art alignment techniques rely on Lucas-Kanade (LK) based solutions for successful alignment. However, we show that we can achieve state-of-the-art results without using LK-based methods. Our approach uses a two-branch convolutional neural network (CNN) built on feature embedding blocks. We propose two variants of our approach: in the first variant (ModelA), we directly predict the new coordinates of only the four corners of the image to be aligned; in the second (ModelB), we predict the homography matrix directly. Applying alignment on the image corners forces the algorithm to match only those four corners, as opposed to computing and matching many (key)points, since the latter may cause many outliers, yielding less accurate alignment. We test our proposed approach on four aerial datasets and obtain state-of-the-art results when compared to existing recent deep LK-based architectures.
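The two variants are closely related: four corner correspondences determine a homography, so a ModelA-style output can be converted into the matrix that ModelB predicts directly. The sketch below illustrates that conversion with the standard four-point linear solve. It is not the authors' code: the function names, the 128x128 image size, and the "predicted" corner values are hypothetical, and the linear solver is written in plain Python only to keep the example dependency-free.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_corners(src, dst):
    """Return the 3x3 homography H (H[2][2] fixed to 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 8 unknowns.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h, x, y):
    """Apply homography h to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Corners of a hypothetical 128x128 image and made-up predicted positions.
src = [(0, 0), (127, 0), (127, 127), (0, 127)]
dst = [(3, -2), (130, 4), (125, 131), (-4, 126)]
H = homography_from_corners(src, dst)
```

Because a homography has eight degrees of freedom and each corner contributes two equations, four corners (no three collinear) determine it exactly, with no outlier rejection step of the kind keypoint matching usually needs.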
