CV · Dec 24, 2021

Realtime Global Attention Network for Semantic Segmentation

arXiv:2112.12939v1
Originality: Incremental advance
AI Analysis

This work addresses real-time semantic segmentation for robotic monocular visual perception. It is an incremental advance whose main contribution is a new global attention module.

The authors tackle real-time semantic segmentation for robotic vision by proposing RGANet, which uses a global attention module based on depth-wise convolution and affine transformations. They report leading performance in comparisons against state-of-the-art segmentation architectures.

In this paper, we propose an end-to-end realtime global attention neural network (RGANet) for the challenging task of semantic segmentation. Unlike the encoding strategy used by self-attention paradigms, the proposed global attention module encodes global attention via depth-wise convolution and affine transformations. Integrating these global attention modules into a hierarchical architecture maintains high inferential performance. In addition, an improved evaluation metric, MGRID, is proposed to alleviate the negative effect of non-convex, widely scattered ground-truth areas. Results from extensive experiments against state-of-the-art architectures for semantic segmentation demonstrate the leading performance of the proposed approaches for robotic monocular visual perception.
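To make the two ingredients named in the abstract concrete, here is a minimal pure-Python sketch of a depth-wise convolution followed by a per-channel affine transformation. It is not the RGANet module: the abstract does not describe how the two operations are combined, so the sigmoid gating and the names `depthwise_conv2d`, `affine_gate`, and `toy_attention` are assumptions made purely for illustration.

```python
import math


def depthwise_conv2d(x, kernels):
    """Depth-wise convolution: each channel is filtered by its own kernel.

    x is [C][H][W] nested lists, kernels is [C][k][k] with odd k.
    Zero padding keeps the spatial size; stride is 1.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        h, w, k = len(channel), len(channel[0]), len(kernel)
        pad = k // 2
        result = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += channel[ii][jj] * kernel[di][dj]
                result[i][j] = acc
        out.append(result)
    return out


def affine_gate(x, context, scales, shifts):
    """Assumed combination step: a per-channel affine map of the context
    (scale * c + shift), squashed by a sigmoid, reweights the input x."""
    out = []
    for xc, cc, a, b in zip(x, context, scales, shifts):
        out.append([
            [xv / (1.0 + math.exp(-(a * cv + b))) for xv, cv in zip(xrow, crow)]
            for xrow, crow in zip(xc, cc)
        ])
    return out


def toy_attention(x, kernels, scales, shifts):
    """Depth-wise convolution produces the context; the affine gate applies it."""
    return affine_gate(x, depthwise_conv2d(x, kernels), scales, shifts)
```

Because each channel has its own kernel, the cost grows linearly with the number of channels rather than with their product, which is one general reason depth-wise convolution is attractive for real-time models.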

Code Implementations: 2 repos
