CV · Mar 26, 2023

You Only Segment Once: Towards Real-Time Panoptic Segmentation

arXiv:2303.14651v1 · 81 citations · h-index: 31 · Has Code
Originality: Incremental advance
AI Analysis

YOSO addresses the need for efficient panoptic segmentation in applications such as autonomous driving. The contribution is incremental: it builds on existing methods and mainly improves their speed.

The paper tackles real-time panoptic segmentation by proposing YOSO, a framework that predicts masks via dynamic convolutions between panoptic kernels and image feature maps. It reports competitive accuracy at real-time speed, for example 46.4 PQ at 45.6 FPS on COCO.
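To make "predicts masks via dynamic convolutions" concrete, here is a minimal pure-Python sketch. It assumes each kernel acts as a 1x1 convolution over a C x H x W feature map, so every mask logit is a dot product across channels. The names `predict_mask_logits` and `binarize`, the toy values, and the zero threshold are illustrative assumptions, not taken from the YOSO code.

```python
from typing import List

Kernel = List[float]                   # one kernel: C channel weights
FeatureMap = List[List[List[float]]]   # C x H x W feature map


def predict_mask_logits(kernels: List[Kernel], features: FeatureMap) -> List[List[List[float]]]:
    """Apply each kernel as a 1x1 convolution over the feature map.

    Returns N x H x W mask logits, one map per kernel (hypothetical helper).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    logits = []
    for kernel in kernels:
        if len(kernel) != channels:
            raise ValueError("kernel length must match the number of feature channels")
        mask = [
            [sum(kernel[c] * features[c][y][x] for c in range(channels)) for x in range(width)]
            for y in range(height)
        ]
        logits.append(mask)
    return logits


def binarize(logits, threshold: float = 0.0):
    """Threshold logits; a logit above 0 corresponds to sigmoid > 0.5."""
    return [[[1 if v > threshold else 0 for v in row] for row in mask] for mask in logits]


# Toy input: 2 channels, a 2x2 feature map, and 2 kernels.
features = [
    [[1.0, -1.0], [2.0, 0.0]],    # channel 0
    [[0.0, 3.0], [-2.0, 1.0]],    # channel 1
]
kernels = [[1.0, 0.0], [0.0, 1.0]]  # each kernel selects a single channel
masks = binarize(predict_mask_logits(kernels, features))
```

All masks come from the same kernel-times-feature product, which is one way to read "segment once": a kernel may stand for a thing instance or a stuff class, and both go through the same step.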

In this paper, we propose YOSO, a real-time panoptic segmentation framework. YOSO predicts masks via dynamic convolutions between panoptic kernels and image feature maps, in which you only need to segment once for both instance and semantic segmentation tasks. To reduce the computational overhead, we design a feature pyramid aggregator for the feature map extraction, and a separable dynamic decoder for the panoptic kernel generation. The aggregator re-parameterizes interpolation-first modules in a convolution-first way, which significantly speeds up the pipeline without any additional costs. The decoder performs multi-head cross-attention via separable dynamic convolution for better efficiency and accuracy. To the best of our knowledge, YOSO is the first real-time panoptic segmentation framework that delivers competitive performance compared to state-of-the-art models. Specifically, YOSO achieves 46.4 PQ, 45.6 FPS on COCO; 52.5 PQ, 22.6 FPS on Cityscapes; 38.0 PQ, 35.4 FPS on ADE20K; and 34.1 PQ, 7.1 FPS on Mapillary Vistas. Code is available at https://github.com/hujiecpp/YOSO.
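The abstract's claim that a convolution-first ordering speeds up the pipeline at no extra cost can be illustrated with a small sketch. It rests on an assumption not spelled out in the abstract: the convolution is pointwise (1x1) and the interpolation is linear, so the two operations commute. The sketch uses 1D signals and midpoint upsampling for brevity; all function names are hypothetical, and the paper's actual modules may differ.

```python
import math
from typing import List

Signal = List[List[float]]  # C x L: channels by spatial positions (1D for brevity)


def upsample_linear(x: Signal) -> Signal:
    """Insert the midpoint between neighbours: length L becomes 2L - 1."""
    out = []
    for row in x:
        up = []
        for i, v in enumerate(row):
            up.append(v)
            if i + 1 < len(row):
                up.append(0.5 * (v + row[i + 1]))
        out.append(up)
    return out


def conv1x1(weight: List[List[float]], x: Signal) -> Signal:
    """Pointwise convolution: mix channels independently at every position."""
    length = len(x[0])
    return [
        [sum(w * x[c][i] for c, w in enumerate(w_row)) for i in range(length)]
        for w_row in weight
    ]


def pointwise_macs(weight: List[List[float]], x: Signal) -> int:
    """Multiply-accumulate count of a 1x1 convolution applied to x."""
    return len(weight) * len(weight[0]) * len(x[0])


x = [[1.0, 3.0, 2.0], [0.0, 4.0, -2.0]]   # 2 channels, 3 positions
weight = [[0.5, 1.0], [2.0, -1.0]]        # maps 2 channels to 2 channels

interp_first = conv1x1(weight, upsample_linear(x))   # upsample, then convolve
conv_first = upsample_linear(conv1x1(weight, x))     # convolve, then upsample

same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for row_a, row_b in zip(interp_first, conv_first)
    for a, b in zip(row_a, row_b)
)
macs_interp_first = pointwise_macs(weight, upsample_linear(x))  # conv on 5 positions
macs_conv_first = pointwise_macs(weight, x)                     # conv on 3 positions
```

Under these assumptions both orderings give the same output, but the convolution-first path runs the convolution on the smaller, pre-upsampling map (12 multiply-accumulates here instead of 20).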

Code Implementations: 2 repos