Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras
This work addresses a challenging problem for autonomous vehicle applications by enabling depth estimation in all illumination conditions, though it is incremental as it builds on existing depth estimation networks with novel adaptations.
The paper tackles cross-spectrum depth estimation from dual-modality cameras (thermal and visible-light) by proposing an unsupervised framework that transfers features from the TIR-VIS domain to the VIS domain and enforces a cross-spectrum depth cycle consistency, achieving better performance than the existing methods it is compared against. It also releases a large dataset of visible-light and far-infrared stereo images to address the shortage of open-source data in this field.
Cross-spectrum depth estimation aims to provide a depth map in all illumination conditions from a pair of dual-spectrum images. It is valuable for autonomous vehicle applications in which the vehicle is equipped with two cameras of different modalities. However, images captured by cameras of different modalities can differ substantially in photometric appearance, which makes cross-spectrum depth estimation a very challenging problem. Moreover, the shortage of large-scale open-source datasets hinders further research in this field. In this paper, we propose an unsupervised visible-light image guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS for short) depth estimation framework that takes a pair of RGB and thermal images captured by a visible-light camera and a thermal camera. We first adopt a base depth estimation network trained on RGB image pairs. Then we propose a multi-scale feature transfer network that maps features from the TIR-VIS domain to the VIS domain so that they fit the trained depth estimation network. Finally, we propose a cross-spectrum depth cycle consistency to improve the depth results for dual-spectrum image pairs. We also release to the community a large dual-spectrum depth estimation dataset with visible-light and far-infrared stereo images captured in different scenes. Experimental results show that our method achieves better performance than the compared existing methods. Our dataset is available at https://github.com/whitecrow1027/VIS-TIR-Datasets.
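The abstract does not spell out the form of the cross-spectrum depth cycle consistency. As a rough illustration of the general idea only, the sketch below implements the generic left-right disparity consistency commonly used in unsupervised stereo, on a single scanline: the disparity predicted for one view is warped into the other view and the two are compared. The function names, the linear interpolation, and the L1 penalty are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch: generic left-right disparity consistency on one scanline.
# This is NOT the paper's loss; names and the L1 penalty are assumptions.

def warp_scanline(values, disparity):
    """Sample `values` (from the other view) at x - d(x) for each pixel x,
    using linear interpolation and clamping at the scanline borders."""
    n = len(values)
    out = []
    for x, d in enumerate(disparity):
        src = min(max(x - d, 0.0), n - 1.0)  # clamp source position to [0, n-1]
        x0 = int(src)                        # left neighbour (src is non-negative)
        x1 = min(x0 + 1, n - 1)              # right neighbour, clamped
        w = src - x0                         # interpolation weight
        out.append((1.0 - w) * values[x0] + w * values[x1])
    return out


def cycle_consistency_loss(disp_a, disp_b):
    """Mean absolute difference between disp_a and disp_b warped into view A."""
    warped_b = warp_scanline(disp_b, disp_a)
    return sum(abs(a - b) for a, b in zip(disp_a, warped_b)) / len(disp_a)


if __name__ == "__main__":
    consistent = cycle_consistency_loss([2.0] * 5, [2.0] * 5)
    inconsistent = cycle_consistency_loss([1.0] * 4, [3.0] * 4)
    print(consistent, inconsistent)
```

In a cross-spectrum setting one would expect `disp_a` and `disp_b` to come from the two modalities, so that a low loss indicates the two predictions agree once brought into the same view; how the paper actually couples the two branches is not stated here.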