CV · Nov 27, 2024

Pixel-aligned RGB-NIR Stereo Imaging and Dataset for Robot Vision

arXiv:2411.18025v2 · 9 citations · h-index: 2 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses a practical problem in robot vision by providing pixel-aligned multi-spectral data for challenging lighting conditions, though the advance is incremental because it builds on existing imaging and dataset methods.

The paper tackles the lack of pixel-level alignment between RGB and NIR images in robotic vision. It introduces a mobile-robot system with pixel-aligned RGB-NIR stereo cameras and a LiDAR sensor, together with a dataset captured under diverse lighting conditions, and reports that using the aligned images is effective for 3D vision.

Integrating RGB and NIR stereo imaging provides complementary spectral information, potentially enhancing robotic 3D vision in challenging lighting conditions. However, existing datasets and imaging systems lack pixel-level alignment between RGB and NIR images, posing challenges for downstream vision tasks. In this paper, we introduce a robotic vision system equipped with pixel-aligned RGB-NIR stereo cameras and a LiDAR sensor mounted on a mobile robot. The system simultaneously captures pixel-aligned pairs of RGB stereo images, NIR stereo images, and temporally synchronized LiDAR points. Utilizing the mobility of the robot, we present a dataset containing continuous video frames under diverse lighting conditions. We then introduce two methods that utilize the pixel-aligned RGB-NIR images: an RGB-NIR image fusion method and a feature fusion method. The first approach enables existing RGB-pretrained vision models to directly utilize RGB-NIR information without fine-tuning. The second approach fine-tunes existing vision models to more effectively utilize RGB-NIR information. Experimental results demonstrate the effectiveness of using pixel-aligned RGB-NIR images across diverse lighting conditions.
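To make the image-fusion idea concrete, the sketch below shows one simple way a pixel-aligned RGB-NIR pair could be merged into a three-channel image that an RGB-pretrained model can consume without fine-tuning. It is not the fusion method from the paper, whose details are not given here. The function name `fuse_rgb_nir`, the `nir_weight` parameter, and the additive luminance blend are all assumptions chosen for illustration. The sketch relies on pixel alignment: each NIR value is applied to the RGB pixel at the same coordinates, with no warping or registration step.

```python
import numpy as np

# Rec. 601 luma weights (they sum to 1.0); used here only as a convenient
# luminance estimate, not something specified by the paper.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def fuse_rgb_nir(rgb, nir, nir_weight=0.5):
    """Blend a pixel-aligned NIR image into the luminance of an RGB image.

    Hypothetical illustration, not the paper's method.

    rgb: float array of shape (H, W, 3), values in [0, 1]
    nir: float array of shape (H, W), values in [0, 1]
    nir_weight: 0 keeps the RGB image unchanged; 1 replaces its luminance
        with the NIR intensity.
    Returns a float array of shape (H, W, 3) clipped to [0, 1].
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[-1] != 3 or rgb.shape[:2] != nir.shape:
        raise ValueError("expected rgb of shape (H, W, 3) and nir of shape (H, W)")

    # Per-pixel luminance of the RGB image, shape (H, W).
    luma = rgb @ LUMA_WEIGHTS

    # Shift every channel by the same amount so the luminance moves toward
    # the NIR intensity while the colour differences between channels stay.
    shift = nir_weight * (nir - luma)
    fused = rgb + shift[..., None]

    return np.clip(fused, 0.0, 1.0)


if __name__ == "__main__":
    # A dark RGB frame paired with a bright NIR frame, as might occur at night.
    dark_rgb = np.zeros((2, 3, 3))
    bright_nir = np.ones((2, 3))
    print(fuse_rgb_nir(dark_rgb, bright_nir, nir_weight=0.5))
```

Because the output keeps the shape and value range of an ordinary RGB image, it can be passed to an existing RGB model as-is. The additive shift is used instead of a multiplicative gain so that NIR detail still shows up in pixels where the RGB image is completely black.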

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
