PST900: RGB-Thermal Calibration, Dataset and Segmentation Network
This work addresses semantic segmentation for robotics in subterranean settings. Its contribution is incremental: it builds on existing learning-based techniques, adding a new dataset and a new architecture.
The authors tackle semantic segmentation in challenging environments by introducing a calibrated RGB-thermal dataset and a CNN architecture, and they report that their method outperforms state-of-the-art approaches on their dataset.
In this work we propose long-wave infrared (LWIR) imagery as a viable supporting modality for semantic segmentation using learning-based techniques. First, we address the problem of RGB-thermal camera calibration by proposing a passive calibration target and a procedure that is both portable and easy to use. Second, we present PST900, a dataset of 894 synchronized and calibrated RGB and thermal image pairs with per-pixel human annotations across four distinct classes from the DARPA Subterranean Challenge. Lastly, we propose a CNN architecture for fast semantic segmentation that combines RGB and thermal imagery in a way that leverages RGB imagery independently. We compare our method against state-of-the-art approaches and show that it outperforms them on our dataset.
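To make the phrase "leverages RGB imagery independently" concrete, the following is a minimal sketch of one possible reading: a first stage predicts class scores from RGB alone, and a second stage refines them using RGB, thermal, and the first-stage scores together. This is not the authors' network. The function names, the random per-pixel linear layers standing in for trained CNNs, and the class count of five (the four annotated classes plus an assumed background class) are all illustrative assumptions.

```python
import random

NUM_CLASSES = 5  # assumption: four annotated classes plus a background class


def make_weights(n_in, n_out, rng):
    """Random weights for a per-pixel linear layer (a stand-in for a trained CNN)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def per_pixel_linear(image, weights):
    """Apply the same linear map to every pixel's feature vector (like a 1x1 convolution)."""
    return [[[sum(w * x for w, x in zip(w_row, pixel)) for w_row in weights]
             for pixel in row] for row in image]


def rgb_stream(rgb, w_rgb):
    """Stage 1 (hypothetical): class scores computed from RGB alone; thermal is never seen."""
    return per_pixel_linear(rgb, w_rgb)


def fusion_stream(rgb, thermal, rgb_scores, w_fuse):
    """Stage 2 (hypothetical): refine using RGB, thermal, and stage-1 scores, concatenated per pixel."""
    stacked = [[list(p_rgb) + [p_t] + list(p_s)
                for p_rgb, p_t, p_s in zip(r_rgb, r_t, r_s)]
               for r_rgb, r_t, r_s in zip(rgb, thermal, rgb_scores)]
    return per_pixel_linear(stacked, w_fuse)


def argmax_labels(scores):
    """Pick the highest-scoring class index at each pixel."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in scores]


# Toy usage on a 4x6 image pair that is assumed to be already calibrated and aligned.
rng = random.Random(0)
H, W = 4, 6
rgb = [[(rng.random(), rng.random(), rng.random()) for _ in range(W)] for _ in range(H)]
thermal = [[rng.random() for _ in range(W)] for _ in range(H)]

w_rgb = make_weights(3, NUM_CLASSES, rng)
w_fuse = make_weights(3 + 1 + NUM_CLASSES, NUM_CLASSES, rng)

coarse = rgb_stream(rgb, w_rgb)
labels = argmax_labels(fusion_stream(rgb, thermal, coarse, w_fuse))
```

Because the first stage depends only on RGB, it could in principle be trained or run on RGB data alone, with thermal used only to refine its output. Whether the paper's architecture is organized exactly this way is not stated in the abstract.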