TriangleNet: Edge Prior Augmented Network for Semantic Segmentation through Cross-Task Consistency
This work addresses semantic segmentation, offering incremental accuracy improvements by explicitly enforcing cross-task consistency when the segmentation network is trained jointly with semantic edge detection.
The paper proposes TriangleNet, a semantic segmentation network that uses a decoupled cross-task consistency loss to strengthen joint training with semantic edge detection. It improves mIoU by 2.88% over the baseline on the Cityscapes test set and runs in real time, reaching 77.4% mIoU at 46.2 FPS.
This paper addresses semantic segmentation, the computer vision task of precise pixel-wise classification. Jointly training models for semantic edge detection and semantic segmentation has shown promise, but the cross-task consistency that multi-task networks learn implicitly is limited. To address this, we propose a novel "decoupled cross-task consistency loss" that explicitly enhances cross-task consistency. Our semantic segmentation network, TriangleNet, improves mean Intersection over Union (mIoU) by 2.88\% over the baseline on the Cityscapes test set. TriangleNet reaches 77.4\% mIoU at 46.2 FPS on Cityscapes, demonstrating real-time inference at full resolution. With multi-scale inference, mIoU rises further to 77.8\%. TriangleNet also consistently outperforms the baseline on the FloodNet dataset, indicating that the approach generalizes beyond Cityscapes. These results underscore the value of multi-task learning with explicit cross-task consistency for semantic segmentation and highlight the potential of multitasking in real-time semantic segmentation.
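
The results above are reported in mean Intersection over Union (mIoU). As a reminder of what the metric measures, the following is a minimal sketch in plain Python. It is not the evaluation code used in the paper. The function name `mean_iou` and the toy label lists are hypothetical, and the sketch assumes predictions and ground truth are flat lists of integer class indices. Benchmark evaluations typically accumulate the per-class counts over an entire dataset rather than a single image.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for two flat lists of class labels.

    Illustrative helper only (hypothetical name, not from the paper).
    """
    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection[c] / union)

    # Average over the classes that actually occur; 0.0 if none do.
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a 2x3 "image" flattened to six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported gain such as 2.88\% mIoU is a difference between two such averages computed on the same test set.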