CV | Dec 24, 2021

Realtime Global Attention Network for Semantic Segmentation

arXiv:2112.12939v1 | 1.4 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses real-time semantic segmentation for robotic monocular visual perception, representing an incremental improvement with a novel attention method.

The authors tackle real-time semantic segmentation for robotic vision by proposing RGANet, which uses a global attention module based on depth-wise convolution and affine transformations. They report leading performance in experiments against state-of-the-art semantic segmentation architectures.

In this paper, we propose an end-to-end realtime global attention neural network (RGANet) for the challenging task of semantic segmentation. Unlike the encoding strategy deployed by self-attention paradigms, the proposed global attention module encodes global attention via depth-wise convolution and affine transformations. Integrating these global attention modules into a hierarchical architecture maintains high inference performance. In addition, an improved evaluation metric, namely MGRID, is proposed to alleviate the negative effect of non-convex, widely scattered ground-truth areas. Results from extensive experiments on state-of-the-art architectures for semantic segmentation demonstrate the leading performance of the proposed approach for robotic monocular visual perception.
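To make the phrase "depth-wise convolution and affine transformations" more concrete, here is a minimal NumPy sketch of an attention-style gate built from those two ingredients. It is not the RGANet module: the abstract does not give the layer layout, so the 3x3 kernel size, the sigmoid gate, and the function names `depthwise_conv2d` and `attention_gate` are all assumptions made for illustration. It also does not try to reproduce how the paper makes the attention global.

```python
import numpy as np


def depthwise_conv2d(x, kernels):
    """Depth-wise 2D cross-correlation with zero 'same' padding.

    x:       (C, H, W) feature map
    kernels: (C, k, k), one filter per channel, k odd
    Each channel is filtered only by its own kernel (no channel mixing).
    """
    c, h, w = x.shape
    k = kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    # Accumulate one shifted, per-channel-weighted copy of the input per tap.
    for i in range(k):
        for j in range(k):
            out += kernels[:, i, j][:, None, None] * xp[:, i:i + h, j:j + w]
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(x, kernels, scale, shift):
    """Hypothetical gate: depth-wise conv -> per-channel affine -> sigmoid.

    scale, shift: (C,) parameters of the affine map (assumed per channel).
    Returns the re-weighted features and the attention weights in (0, 1).
    """
    context = depthwise_conv2d(x, kernels)
    affine = scale[:, None, None] * context + shift[:, None, None]
    weights = sigmoid(affine)
    return x * weights, weights


# Example with random stand-in values (not trained parameters).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
kernels = rng.normal(size=(4, 3, 3))
gated, weights = attention_gate(x, kernels, np.ones(4), np.zeros(4))
print(gated.shape, float(weights.min()), float(weights.max()))
```

Because each channel has its own filter and its own scale and shift, the parameter count grows with the number of channels rather than with the square of it, which is one plausible reason such a design suits a realtime setting.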

