InverseForm: A Loss Function for Structured Boundary-Aware Segmentation
This work addresses boundary refinement in semantic segmentation. It offers a plug-in loss term that improves existing models without increasing their size or computational complexity; the contribution is incremental in that it builds on standard segmentation backbones.
The authors tackle the problem of improving semantic segmentation accuracy by introducing a boundary-aware loss term, built on an inverse-transformation network that learns the degree of parametric transformation between estimated and target boundaries. The result is consistent performance gains across three benchmarks, including a new state of the art on two datasets.
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network, which efficiently learns the degree of parametric transformations between estimated and target boundaries. This plug-in loss term complements the cross-entropy loss in capturing boundary transformations and yields consistent and significant performance improvements on segmentation backbone models without increasing their size or computational complexity. We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks (Cityscapes, NYU-Depth-v2, and PASCAL), integrating it into the training phase of several backbone networks in both single-task and multi-task settings. Our extensive experiments show that the proposed method consistently outperforms baselines and sets a new state of the art on two datasets.
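To make the idea of a plug-in boundary term concrete, the following is a minimal sketch of how such a term might be added to a cross-entropy loss. It rests on several assumptions that are not stated in the text above: that the inverse-transformation network outputs a 3x3 transformation matrix relating the estimated boundary to the target boundary, that the penalty is the Frobenius distance of that matrix from the identity, and that the two terms are combined with a scalar weight. The network itself is not implemented here; `theta` stands in for its output, and the names `transform_distance`, `total_loss`, and `weight` are hypothetical.

```python
import math

# Identity transform: estimated and target boundaries already coincide.
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def transform_distance(theta):
    """Frobenius distance between a 3x3 transform estimate and the identity.

    Assumption: `theta` is what an inverse-transformation network would
    output for a pair (estimated boundary, target boundary).
    """
    return math.sqrt(sum((theta[i][j] - IDENTITY[i][j]) ** 2
                         for i in range(3) for j in range(3)))


def total_loss(cross_entropy, theta, weight=0.1):
    """Cross-entropy plus a weighted boundary-transformation penalty.

    Assumption: the terms are combined additively; `weight` is an
    illustrative value, not one taken from the paper.
    """
    return cross_entropy + weight * transform_distance(theta)


# Case 1: boundaries are aligned, so the estimated transform is the identity.
aligned = IDENTITY

# Case 2: boundaries are offset by a small translation (tx=0.3, ty=0.4).
shifted = [[1.0, 0.0, 0.3],
           [0.0, 1.0, 0.4],
           [0.0, 0.0, 1.0]]

loss_aligned = total_loss(0.5, aligned)
loss_shifted = total_loss(0.5, shifted)
```

With the same cross-entropy value in both cases, the shifted boundary incurs a larger total loss, which is the sense in which the extra term is boundary-aware. Because the term is used only during training, the segmentation backbone itself is unchanged at inference time.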