Locally Adaptive Learning Loss for Semantic Image Segmentation
This work addresses the challenge of enhancing sensitivity to regional connections in segmentation for computer vision applications, but it appears incremental as it builds on existing loss layer methods.
The paper tackles the problem of semantic image segmentation by proposing a locally adaptive learning estimator to improve inter- and intra-class discriminative capabilities, resulting in consistently improved segmentation masks on the Pascal VOC 2012 dataset.
We propose a novel locally adaptive learning estimator for enhancing the inter- and intra-class discriminative capabilities of Deep Neural Networks, which can be used as an improved loss layer for semantic image segmentation tasks. Most loss layers compute a pixel-wise cost between feature maps and ground truths, ignoring spatial layouts and interactions between neighboring pixels of the same object category, so networks trained with them are not sufficiently sensitive to intra-class connections. Stride by stride, our method first applies an adaptive pooling filter over the predicted feature maps to merge the predicted distributions of a small group of neighboring pixels that share a category, and then computes the cost between the merged distribution vector and the category label. With this design, groups of neighboring predictions from the same category jointly contribute to the estimate of prediction correctness for that category, which trains networks to be more sensitive to regional connections between adjacent pixels of the same category. In experiments on the Pascal VOC 2012 segmentation dataset, our approach consistently yields better segmentation masks than previous counterparts.
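
The merge-then-score idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the "adaptive pooling filter" averages the softmax distributions of same-category pixels inside each window, that the cost is cross-entropy against the shared label, and that the per-group costs are averaged. The function name `locally_adaptive_loss` and the parameters `kernel`, `stride`, and `eps` are hypothetical.

```python
import numpy as np

def locally_adaptive_loss(probs, labels, kernel=2, stride=2, eps=1e-12):
    """Illustrative sketch only (assumed design, not the paper's code).

    probs:  (H, W, C) array of per-pixel class probabilities (e.g. softmax output).
    labels: (H, W) integer array of ground-truth class ids.
    """
    H, W, C = probs.shape
    group_losses = []
    # Slide a kernel x kernel window over the prediction map, stride by stride.
    for i in range(0, H - kernel + 1, stride):
        for j in range(0, W - kernel + 1, stride):
            p = probs[i:i + kernel, j:j + kernel].reshape(-1, C)
            y = labels[i:i + kernel, j:j + kernel].reshape(-1)
            # "Adaptive" step: group the window's pixels by ground-truth category.
            for c in np.unique(y):
                # Assumption: merge the group's distributions by averaging.
                merged = p[y == c].mean(axis=0)
                # Cross-entropy between the merged distribution and label c.
                group_losses.append(-np.log(merged[c] + eps))
    # Assumption: the final loss is the mean over all groups in all windows.
    return float(np.mean(group_losses))

# Toy example: a 4x4 map with 3 classes and uniform predictions.
probs = np.full((4, 4, 3), 1.0 / 3.0)
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
loss = locally_adaptive_loss(probs, labels)
```

With uniform predictions every merged distribution is also uniform, so the toy loss equals log 3 up to the `eps` term. Because each group contributes one cost term regardless of its size, a single confidently wrong pixel is softened by correctly predicted neighbors of the same category, which is one way to read the regional sensitivity the abstract describes.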