CV | Nov 18, 2024

MGNiceNet: Unified Monocular Geometric Scene Understanding

arXiv:2411.11466v2 | h-index: 6 | Has Code | ACCV
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient scene understanding in autonomous driving. The advance is incremental: it extends the existing real-time panoptic segmentation method RT-K-Net rather than proposing a new architecture.

The paper tackles real-time monocular geometric scene understanding for autonomous vehicles by unifying panoptic segmentation and self-supervised depth estimation in one model. It reports state-of-the-art results among real-time methods on the Cityscapes and KITTI datasets.

Monocular geometric scene understanding combines panoptic segmentation and self-supervised depth estimation, focusing on real-time application in autonomous vehicles. We introduce MGNiceNet, a unified approach that uses a linked kernel formulation for panoptic segmentation and self-supervised depth estimation. MGNiceNet is based on the state-of-the-art real-time panoptic segmentation method RT-K-Net and extends the architecture to cover both panoptic segmentation and self-supervised monocular depth estimation. To this end, we introduce a tightly coupled self-supervised depth estimation predictor that explicitly uses information from the panoptic path for depth prediction. Furthermore, we introduce a panoptic-guided motion masking method to improve depth estimation without relying on video panoptic segmentation annotations. We evaluate our method on two popular autonomous driving datasets, Cityscapes and KITTI. Our model shows state-of-the-art results compared to other real-time methods and closes the gap to computationally more demanding methods. Source code and trained models are available at https://github.com/markusschoen/MGNiceNet.
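The abstract does not spell out how the panoptic-guided motion masking works, so the following is only a generic sketch of the underlying idea: excluding pixels of potentially moving classes from a self-supervised photometric loss, using per-pixel class predictions. It is not the authors' implementation. The function name `masked_photometric_loss`, the class ids, and the error values are all hypothetical, and a plain mean over unmasked pixels is assumed.

```python
from typing import List, Set


def masked_photometric_loss(
    photo_error: List[List[float]],
    class_ids: List[List[int]],
    dynamic_classes: Set[int],
) -> float:
    """Mean photometric error over pixels not labeled as a dynamic class.

    photo_error: per-pixel reconstruction error between target and warped source.
    class_ids: per-pixel semantic class from the panoptic prediction (hypothetical ids).
    dynamic_classes: class ids assumed to be potentially moving (e.g. vehicles).
    """
    total = 0.0
    count = 0
    for err_row, cls_row in zip(photo_error, class_ids):
        for err, cls in zip(err_row, cls_row):
            if cls in dynamic_classes:
                # Moving objects violate the static-scene assumption, so skip them.
                continue
            total += err
            count += 1
    # Return 0.0 when every pixel is masked to avoid dividing by zero.
    return total / count if count else 0.0


# Toy 2x2 example: the pixel with the hypothetical class id 13 is masked out.
photo_error = [[0.1, 0.9], [0.2, 0.3]]
class_ids = [[0, 13], [0, 8]]
loss = masked_photometric_loss(photo_error, class_ids, dynamic_classes={13})
```

In this toy case the large error at the masked pixel does not contribute, so the loss is the mean of the three remaining values. A real pipeline would operate on tensors and would typically decide per instance whether an object is actually moving, rather than masking whole classes.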

