CV · Sep 18, 2019

Feature Pyramid Encoding Network for Real-time Semantic Segmentation

arXiv:1909.08599v1 · 89 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient real-time semantic segmentation in applications such as autonomous driving. The advance is incremental: it builds on existing real-time methods.

The paper tackles the high computational cost and large parameter counts of semantic segmentation networks by proposing a lightweight network, FPENet, which reaches 68.0% mean IoU on the Cityscapes test set with only 0.4M parameters at 102 FPS on an NVIDIA TITAN V GPU.

Although current deep learning methods have achieved impressive results for semantic segmentation, they incur high computational costs and have a huge number of parameters. For real-time applications, inference speed and memory usage are two important factors. To address the challenge, we propose a lightweight feature pyramid encoding network (FPENet) to make a good trade-off between accuracy and speed. Specifically, we use a feature pyramid encoding block to encode multi-scale contextual features with depthwise dilated convolutions in all stages of the encoder. A mutual embedding upsample module is introduced in the decoder to aggregate the high-level semantic features and low-level spatial details efficiently. The proposed network outperforms existing real-time methods with fewer parameters and improved inference speed on the Cityscapes and CamVid benchmark datasets. Specifically, FPENet achieves 68.0% mean IoU on the Cityscapes test set with only 0.4M parameters and 102 FPS speed on an NVIDIA TITAN V GPU.
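To make the abstract's "depthwise dilated convolutions" concrete, the sketch below compares the weight count of a dense 3x3 convolution with a depthwise one, and shows how dilation widens the area a 3x3 kernel covers without adding weights. It is an illustration of the general technique, not the FPENet implementation: the channel width of 64 and the dilation rates (1, 2, 4, 8) are assumptions chosen for the example, and the helper names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a dense k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_conv_params(channels, k=3):
    """Weights in a depthwise k x k convolution: one k x k filter per channel."""
    return k * k * channels


def effective_kernel_size(k, dilation):
    """Span in pixels covered by a k-tap kernel whose taps are `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    channels = 64               # assumed width, for illustration only
    dilations = (1, 2, 4, 8)    # assumed rates, for illustration only

    dense = standard_conv_params(channels, channels)
    depthwise = depthwise_conv_params(channels)
    print(f"dense 3x3:     {dense} weights")
    print(f"depthwise 3x3: {depthwise} weights ({dense // depthwise}x fewer)")

    # Dilation changes the span of the kernel, not its weight count.
    for d in dilations:
        span = effective_kernel_size(3, d)
        print(f"dilation {d}: covers {span}x{span} pixels, "
              f"still {depthwise} weights")
```

Under these assumptions, the depthwise layer needs 64 times fewer weights than the dense one, while branches with larger dilation rates see progressively wider context at the same cost. That is the general reason this combination suits a multi-scale encoder with a small parameter budget.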

