CV · Apr 2, 2018

Low-Latency Video Semantic Segmentation

arXiv:1804.00389v1 · 177 citations
Originality: Incremental advance
AI Analysis

This work addresses low-latency video segmentation for real-time applications such as autonomous driving, representing an incremental improvement over existing methods.

The paper tackles low-latency video semantic segmentation for applications such as autonomous driving. Its framework combines a feature propagation module with an adaptive scheduler, reducing latency from 360 ms to 119 ms while maintaining competitive performance on Cityscapes and CamVid.

Recent years have seen remarkable progress in semantic segmentation. Yet, it remains a challenging task to apply segmentation techniques to video-based applications. Specifically, the high throughput of video streams, the sheer cost of running fully convolutional networks, together with the low-latency requirements in many real-world applications, e.g. autonomous driving, present a significant challenge to the design of the video segmentation framework. To tackle this combined challenge, we develop a framework for video semantic segmentation, which incorporates two novel components: (1) a feature propagation module that adaptively fuses features over time via spatially variant convolution, thus reducing the cost of per-frame computation; and (2) an adaptive scheduler that dynamically allocates computation based on accuracy prediction. Both components work together to ensure low latency while maintaining high segmentation quality. On both Cityscapes and CamVid, the proposed framework obtained competitive performance compared to the state of the art, while substantially reducing the latency, from 360 ms to 119 ms.
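The two components named in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the single-channel grids, the function names (`spatially_variant_conv`, `identity_kernels`, `schedule`), the callbacks `full_features`, `predict_kernels`, and `predicted_drop`, the fixed `threshold`, and the choice to propagate from the most recent fully computed frame are all assumptions made for illustration. The sketch only shows the general idea: a convolution whose kernel differs at every location, and a scheduler that runs the expensive network only when the predicted accuracy drop is too large.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]        # one feature channel, H x W
KernelGrid = List[List[Grid]]   # one K x K kernel per location: H x W x K x K


def spatially_variant_conv(feat: Grid, kernels: KernelGrid) -> Grid:
    """Apply a different K x K kernel at every location (zero padding)."""
    h, w = len(feat), len(feat[0])
    k = len(kernels[0][0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernels[y][x][dy][dx] * feat[yy][xx]
            out[y][x] = acc
    return out


def identity_kernels(h: int, w: int, k: int = 3) -> KernelGrid:
    """Hypothetical helper: kernels that copy each feature unchanged."""
    r = k // 2
    return [[[[1.0 if (dy == r and dx == r) else 0.0 for dx in range(k)]
              for dy in range(k)]
             for _ in range(w)]
            for _ in range(h)]


def schedule(
    frames: List[Grid],
    full_features: Callable[[Grid], Grid],             # expensive network (assumed)
    predict_kernels: Callable[[Grid, Grid], KernelGrid],  # cheap kernel predictor (assumed)
    predicted_drop: Callable[[Grid, Grid], float],     # accuracy-drop estimate (assumed)
    threshold: float,
) -> Tuple[List[str], List[Grid]]:
    """Recompute features only when the predicted accuracy drop exceeds threshold."""
    modes: List[str] = []
    outputs: List[Grid] = []
    key_frame = None
    key_feat = None
    for frame in frames:
        if key_feat is None or predicted_drop(key_frame, frame) > threshold:
            # Expensive path: run the full network and remember this frame.
            key_frame, key_feat = frame, full_features(frame)
            modes.append("key")
            outputs.append(key_feat)
        else:
            # Cheap path: propagate the stored features with per-location kernels.
            kernels = predict_kernels(key_frame, frame)
            modes.append("propagated")
            outputs.append(spatially_variant_conv(key_feat, kernels))
    return modes, outputs


def mean_abs_diff(a: Grid, b: Grid) -> float:
    """Toy stand-in for an accuracy-drop predictor: mean absolute frame difference."""
    n = len(a) * len(a[0])
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n
```

In this toy setup a frame close to the last fully processed one is handled by the cheap propagation path, while a frame that differs strongly triggers a full recomputation. In a real system the kernels and the accuracy estimate would come from learned predictors rather than the hand-written stand-ins used here.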

