CV | RO | Jul 19, 2020

Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation

arXiv:2007.09746v1 | 2 citations
AI Analysis

This work addresses the efficiency and quality of semantic image segmentation, a task that is crucial for applications such as autonomous driving and robotics. The contribution appears incremental, since it builds on existing encoder-decoder methodologies.

The authors tackled the limitations of single encoder-decoder networks for semantic image segmentation by proposing a new architecture with a decoder using shallow networks, novel skip connections, and a class re-balancing weight function, achieving state-of-the-art results on CamVid, Gatech, and Freiburg Forest datasets.

Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers. To address these limitations, we propose a new architecture based on a decoder that uses a set of shallow networks to capture more information content. The new decoder has a new topology of skip connections, namely backward and stacked residual connections. To further improve the architecture, we introduce a weight function that re-balances classes to increase the attention of the networks to under-represented objects. We carried out an extensive set of experiments that yielded state-of-the-art results on the CamVid, Gatech and Freiburg Forest datasets. Moreover, to further demonstrate the effectiveness of our decoder, we conducted a set of experiments studying its impact on state-of-the-art segmentation techniques. Additionally, we present a set of experiments augmenting semantic segmentation with optical flow information, showing that motion cues can boost purely image-based semantic segmentation approaches.
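
The abstract does not give the form of the class re-balancing weight function, so the following is only a minimal sketch of the general idea, assuming plain inverse-frequency weighting. This is a common stand-in, not the authors' function. The names `class_weights` and `weighted_cross_entropy` are hypothetical, and pixels are represented as flat Python lists for brevity.

```python
from collections import Counter
from math import log


def class_weights(labels, num_classes):
    """Inverse-frequency class weights, normalized to mean 1 over present classes.

    ASSUMPTION: inverse frequency is a generic re-balancing scheme chosen for
    illustration; the paper's actual weight function is not given in the abstract.
    """
    counts = Counter(labels)
    total = len(labels)
    # Rare classes get a larger raw weight; absent classes are skipped.
    raw = {c: total / counts[c] for c in range(num_classes) if counts[c] > 0}
    mean_raw = sum(raw.values()) / len(raw)
    return [raw[c] / mean_raw if c in raw else 0.0 for c in range(num_classes)]


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy averaged over pixels.

    probs:  per-pixel lists of predicted class probabilities
    labels: per-pixel ground-truth class ids
    """
    num = sum(-weights[y] * log(p[y]) for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


# Toy example: class 1 covers only 2 of 10 pixels, so it is weighted more heavily.
labels = [0] * 8 + [1] * 2
weights = class_weights(labels, num_classes=2)
probs = [[0.5, 0.5] for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With weights of this kind, an error on a pixel of a rare class contributes more to the loss than an error on a pixel of a dominant class such as road or sky, which is the effect the abstract describes as increasing attention to under-represented objects.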
