CV | Jun 24, 2019

ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation

arXiv:1906.09826v1 | 72 citations
Originality: Incremental advance
AI Analysis

This paper addresses real-time semantic segmentation under limited computational resources and represents an incremental improvement in efficiency.

The paper tackles the computational cost of deep convolutional neural networks for semantic segmentation by proposing ESNet, an efficient symmetric network. With nearly 1.6M parameters, ESNet runs at over 62 FPS on a single GTX 1080Ti GPU and achieves a state-of-the-art speed-accuracy trade-off on the CityScapes dataset.

Recent years have witnessed great advances in semantic segmentation using deep convolutional neural networks (DCNNs). However, the large number of convolutional layers and feature channels makes semantic segmentation a computationally heavy task, which is a disadvantage in scenarios with limited resources. In this paper, we design an efficient symmetric network, called ESNet, to address this problem. The whole network has a nearly symmetric architecture, composed mainly of a series of factorized convolution units (FCUs) and their parallel counterparts (PFCUs). On one hand, the FCU adopts a widely used 1D factorized convolution in its residual layers. On the other hand, the parallel version employs a transform-split-transform-merge strategy in the design of the residual module, where the split branches adopt dilated convolutions with different rates to enlarge the receptive field. Our model has nearly 1.6M parameters and runs at over 62 FPS on a single GTX 1080Ti GPU. Experiments demonstrate that our approach achieves state-of-the-art results in terms of the speed-accuracy trade-off for real-time semantic segmentation on the CityScapes dataset.
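
The abstract names two mechanisms without giving numbers: 1D factorized convolution, which reduces weights, and dilated convolution, which widens the receptive field. The sketch below is a rough illustration of that arithmetic, not the authors' implementation. The kernel size of 3, the 64-channel width, and the dilation rates are all assumptions chosen for the example, and the helper names are hypothetical.

```python
def conv2d_params(in_ch, out_ch, kh, kw, bias=False):
    """Weight count of a standard 2D convolution with a kh x kw kernel."""
    return in_ch * out_ch * kh * kw + (out_ch if bias else 0)


def factorized_params(ch, k, bias=False):
    """Weight count of a k x 1 convolution followed by a 1 x k convolution.

    This is the usual reading of "1D factorized convolution"; the abstract
    does not state the kernel size, so k is an assumption of this sketch.
    """
    return conv2d_params(ch, ch, k, 1, bias) + conv2d_params(ch, ch, 1, k, bias)


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel with the given dilation rate."""
    return dilation * (k - 1) + 1


if __name__ == "__main__":
    channels = 64          # assumed width, not taken from the abstract
    kernel = 3             # assumed kernel size
    rates = (1, 2, 4)      # hypothetical dilation rates for the split branches

    full = conv2d_params(channels, channels, kernel, kernel)
    fact = factorized_params(channels, kernel)
    print(f"full {kernel}x{kernel} conv weights: {full}")
    print(f"factorized conv weights:  {fact}")

    for r in rates:
        print(f"dilation {r}: covers {effective_kernel(kernel, r)} pixels per axis")
```

Under these assumptions the factorized pair needs 24,576 weights against 36,864 for the full 3x3 convolution, one third fewer, while dilation widens the covered extent without adding any weights.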

Code Implementations: 2 repos