SpiderMesh: Spatial-aware Demand-guided Recursive Meshing for RGB-T Semantic Segmentation
This work addresses robust urban scene understanding for applications such as autonomous driving by using thermal data to improve segmentation accuracy, though it appears incremental in that it builds on existing RGB-T methods.
The paper tackles semantic segmentation in urban scenes by combining RGB and thermal (RGB-T) data to improve performance in challenging lighting conditions, reporting state-of-the-art results on the MFNet and PST900 datasets.
For semantic segmentation in urban scene understanding, RGB cameras alone often fail to capture a clear holistic topology in challenging lighting conditions. The thermal signal is an informative additional channel that can bring to light the contours and fine-grained texture of blurred regions in low-quality RGB images. Aiming at practical RGB-T (thermal) segmentation, we propose a Spatial-aware Demand-guided Recursive Meshing (SpiderMesh) framework that: 1) proactively compensates for inadequate contextual semantics in optically-impaired regions via a demand-guided target masking algorithm; 2) refines multimodal semantic features with recursive meshing to improve pixel-level semantic analysis performance. We further introduce an asymmetric data augmentation technique, M-CutOut, and enable semi-supervised learning to make full use of RGB-T labels, which are only sparsely available in practice. Extensive experiments on the MFNet and PST900 datasets demonstrate that SpiderMesh achieves state-of-the-art performance on standard RGB-T segmentation benchmarks.
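The abstract does not specify how M-CutOut works, so the following is only a minimal sketch of one plausible reading of "asymmetric" CutOut for paired RGB-T inputs: a random rectangle is erased in exactly one modality while the other is left intact, so the model has to recover the missing region from the complementary signal. The function name `asymmetric_cutout`, the `max_frac` parameter, the 50/50 modality choice, and zero-filling are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def asymmetric_cutout(rgb, thermal, rng, max_frac=0.5):
    """Hypothetical sketch (not the paper's M-CutOut): erase one random
    rectangle in exactly one of the two modalities.

    rgb:     array of shape (H, W, 3)
    thermal: array of shape (H, W, 1), spatially aligned with rgb
    rng:     numpy.random.Generator
    """
    h, w = rgb.shape[:2]
    # Rectangle size: at least 1 pixel, at most max_frac of each side (assumed).
    ch = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    cw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))

    # Work on copies so the caller's arrays are left untouched.
    rgb_out, thermal_out = rgb.copy(), thermal.copy()

    # Assumed policy: pick the modality to corrupt with equal probability.
    target = "rgb" if rng.random() < 0.5 else "thermal"
    if target == "rgb":
        rgb_out[top:top + ch, left:left + cw] = 0
    else:
        thermal_out[top:top + ch, left:left + cw] = 0
    return rgb_out, thermal_out, target

# Usage on dummy data: exactly one modality ends up with an erased patch.
rng = np.random.default_rng(0)
rgb = np.ones((8, 8, 3), dtype=np.float32)
thermal = np.ones((8, 8, 1), dtype=np.float32)
rgb_aug, thermal_aug, which = asymmetric_cutout(rgb, thermal, rng)
```

In this sketch the segmentation label map is deliberately left unchanged, since the erased region is still visible in the other modality; whether the actual method treats labels this way is not stated in the abstract.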