CV · Jun 7, 2016

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

arXiv:1606.02147v1 · 2337 citations
Originality: Highly original
AI Analysis

This work addresses the need for low-latency semantic segmentation in mobile applications and reports a substantial efficiency gain over existing models.

The paper tackles the problem of real-time semantic segmentation for mobile applications by proposing ENet, an efficient neural network architecture that achieves up to 18x faster speed, 75x fewer FLOPs, and 79x fewer parameters while maintaining similar or better accuracy compared to existing models.

The ability to perform pixel-wise semantic segmentation in real-time is of paramount importance in mobile applications. Recent deep neural networks aimed at this task have the disadvantage of requiring a large number of floating point operations and have long run-times that hinder their usability. In this paper, we propose a novel deep neural network architecture named ENet (efficient neural network), created specifically for tasks requiring low latency operation. ENet is up to 18$\times$ faster, requires 75$\times$ fewer FLOPs, has 79$\times$ fewer parameters, and provides similar or better accuracy compared with existing models. We have tested it on CamVid, Cityscapes and SUN datasets and report on comparisons with existing state-of-the-art methods, and the trade-offs between accuracy and processing time of a network. We present performance measurements of the proposed architecture on embedded systems and suggest possible software improvements that could make ENet even faster.
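The FLOP and parameter ratios quoted in the abstract compare whole networks, but the quantities behind them are counted layer by layer. The sketch below is only an illustration of that bookkeeping for one standard 2-D convolution. The helper names `conv2d_cost` and `reduction_factor` and the layer sizes are hypothetical and are not taken from ENet. It counts multiply-accumulates (MACs); some conventions count one MAC as two FLOPs, so absolute numbers depend on which convention is used, while the ratio between two layers does not.

```python
def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """Return (parameters, multiply-accumulates) for one standard 2-D convolution.

    Assumes a square kernel and no grouping or dilation tricks; these are
    illustrative simplifications, not a description of ENet's layers.
    """
    weights = kernel * kernel * in_ch * out_ch
    params = weights + (out_ch if bias else 0)
    # Each output pixel of each output channel needs kernel*kernel*in_ch MACs.
    macs = weights * out_h * out_w
    return params, macs


def reduction_factor(baseline, compact):
    """How many times smaller `compact` is than `baseline`."""
    return baseline / compact


# Hypothetical layers: same 3x3 kernel and 128x128 output, different widths.
wide = conv2d_cost(in_ch=64, out_ch=64, kernel=3, out_h=128, out_w=128)
narrow = conv2d_cost(in_ch=16, out_ch=16, kernel=3, out_h=128, out_w=128)

print("wide   (params, MACs):", wide)
print("narrow (params, MACs):", narrow)
print("MAC reduction:", reduction_factor(wide[1], narrow[1]))
```

Cutting both channel counts by a factor of 4 reduces the MACs of this layer by a factor of 16, which shows how narrower layers can shrink compute and parameter counts much faster than linearly.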

Code Implementations: 49 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
