DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation through SwaftNet for Surrounding Sensing
This addresses the limitation of existing segmentation methods designed for narrow field-of-view cameras, enabling better scene interpretation for autonomous transportation and robotics, though it is incremental as it adapts existing networks rather than introducing a new paradigm.
The paper tackles the problem of semantic segmentation for panoramic images in autonomous systems by proposing DS-PASS, a framework that adapts conventional networks to panoramic views using SwaftNet with attention-based connections, achieving real-time performance and outperforming state-of-the-art efficient networks on their extended PASS dataset.
Semantically interpreting the traffic scene is crucial for autonomous transportation and robotics systems. However, state-of-the-art semantic segmentation pipelines are dominantly designed to work with pinhole cameras and train with narrow Field-of-View (FoV) images. In this sense, the perception capacity is severely limited to offer higher-level confidence for upstream navigation tasks. In this paper, we propose a network adaptation framework to achieve Panoramic Annular Semantic Segmentation (PASS), which allows to re-use conventional pinhole-view image datasets, enabling modern segmentation networks to comfortably adapt to panoramic images. Specifically, we adapt our proposed SwaftNet to enhance the sensitivity to details by implementing attention-based lateral connections between the detail-critical encoder layers and the context-critical decoder layers. We benchmark the performance of efficient segmenters on panoramic segmentation with our extended PASS dataset, demonstrating that the proposed real-time SwaftNet outperforms state-of-the-art efficient networks. Furthermore, we assess real-world performance when deploying the Detail-Sensitive PASS (DS-PASS) system on a mobile robot and an instrumented vehicle, as well as the benefit of panoramic semantics for visual odometry, showing the robustness and potential to support diverse navigational applications.