CV · Aug 2, 2018

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

arXiv:1808.00897v1 · 2666 citations
Originality: Incremental advance
AI Analysis

The paper addresses the poor accuracy of real-time semantic segmentation in applications such as autonomous driving, though it is an incremental improvement over existing methods.

The paper tackles the trade-off between spatial resolution and inference speed in real-time semantic segmentation by proposing the Bilateral Segmentation Network (BiSeNet), which achieves 68.4% Mean IOU on Cityscapes at 105 FPS.

Semantic segmentation requires both rich spatial information and a sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain a sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture strikes the right balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048×1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset at a speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than existing methods with comparable performance.
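To make the two-path trade-off concrete, the following is a minimal arithmetic sketch of the feature-map sizes each path would produce for the 2048×1024 input, together with the per-frame time budget implied by 105 FPS. The abstract only says "small stride" and "fast downsampling", so the stride values used here (8 for the Spatial Path, 16 and 32 for the Context Path) are assumptions for illustration, and the names `FeatureMap` and `feature_map` are hypothetical helpers, not part of any released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMap:
    """Spatial size of a feature map at a given total downsampling stride."""
    name: str
    stride: int
    height: int
    width: int


def feature_map(name: str, stride: int, in_h: int, in_w: int) -> FeatureMap:
    # Integer division models a stack of strided layers with total stride `stride`.
    return FeatureMap(name, stride, in_h // stride, in_w // stride)


# Figures stated in the abstract.
INPUT_W, INPUT_H = 2048, 1024
FPS = 105

# ASSUMPTION: illustrative strides; the abstract does not state them.
SPATIAL_STRIDE = 8
CONTEXT_STRIDES = (16, 32)

spatial = feature_map("spatial_path", SPATIAL_STRIDE, INPUT_H, INPUT_W)
context = [
    feature_map(f"context_path_s{s}", s, INPUT_H, INPUT_W)
    for s in CONTEXT_STRIDES
]

# Time available per frame at the reported speed, in milliseconds.
latency_ms = 1000.0 / FPS

# Input pixels processed per second at the reported speed.
pixels_per_second = INPUT_W * INPUT_H * FPS

if __name__ == "__main__":
    for fm in [spatial, *context]:
        print(f"{fm.name}: stride {fm.stride} -> {fm.height}x{fm.width}")
    print(f"per-frame budget: {latency_ms:.2f} ms")
    print(f"throughput: {pixels_per_second:,} pixels/s")
```

Under these assumed strides, the Spatial Path keeps a map many times larger than the deepest Context Path output, which is the resolution-versus-receptive-field split the abstract describes. At 105 FPS the whole network has roughly 9.5 ms per frame.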

Code Implementations: 21 repos

Data from Papers with Code (CC-BY-SA-4.0)

