Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models
This work addresses the fixed-label limitation of panoptic mapping, which prevents robots from handling novel objects. It proposes a new method for a known bottleneck.
The paper tackles semantic mapping in robotics with a unified promptable panoptic mapping approach that uses foundation models for dynamic labeling without retraining. It achieves a geometry reconstruction accuracy of 0.61 cm on the Flat dataset and a panoptic quality of 0.414.
In robotics and computer vision, semantic mapping remains a critical challenge for machines to comprehend complex environments. Traditional panoptic mapping approaches are constrained by fixed labels, limiting their ability to handle novel objects. We present Unified Promptable Panoptic Mapping (UPPM), which leverages foundation models for dynamic labeling without additional training. UPPM is evaluated at three levels: Segmentation-to-Map, Map-to-Map, and Segmentation-to-Segmentation. Results show that UPPM attains exceptional geometry reconstruction accuracy (0.61 cm on the Flat dataset), the highest panoptic quality (0.414), and better performance than state-of-the-art segmentation methods. Furthermore, ablation studies validate the contributions of unified semantics, custom non-maximum suppression (NMS), and blurry frame filtering, with the custom NMS improving the completion ratio by 8.27% on the Flat dataset. UPPM demonstrates effective scene reconstruction with rich semantic labeling across diverse datasets.
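For readers unfamiliar with the panoptic quality (PQ) figure reported above, the following is a minimal sketch of the standard PQ definition: the sum of IoUs over matched segment pairs, divided by TP + 0.5 FP + 0.5 FN, where a predicted and a ground-truth segment of the same class match when their IoU exceeds 0.5. It is not UPPM's evaluation code. The function name `panoptic_quality`, its parameters, and the input layout (a dictionary of precomputed IoUs) are assumptions made for illustration.

```python
from typing import Dict, Tuple


def panoptic_quality(
    ious: Dict[Tuple[int, int], float],
    num_pred: int,
    num_gt: int,
    iou_threshold: float = 0.5,
) -> float:
    """Compute PQ from precomputed IoUs (hypothetical input layout).

    ious maps (pred_id, gt_id) -> IoU for same-class segment pairs that overlap.
    num_pred and num_gt are the total numbers of predicted and ground-truth segments.
    """
    # With non-overlapping segments, IoU > 0.5 guarantees each segment
    # takes part in at most one match, so no assignment step is needed.
    matched = {pair: iou for pair, iou in ious.items() if iou > iou_threshold}

    tp = len(matched)      # matched pairs
    fp = num_pred - tp     # predicted segments left unmatched
    fn = num_gt - tp       # ground-truth segments left unmatched

    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return 0.0         # no segments at all; PQ is taken as 0 here
    return sum(matched.values()) / denom


if __name__ == "__main__":
    # Three predictions, three ground-truth segments, two pairs above threshold.
    example = {(0, 0): 0.8, (1, 1): 0.6, (2, 1): 0.3}
    print(panoptic_quality(example, num_pred=3, num_gt=3))
```

In this toy example two pairs match, leaving one false positive and one false negative, so PQ is (0.8 + 0.6) / (2 + 0.5 + 0.5). PQ penalizes both poor segment overlap and missed or spurious segments, which is why it is commonly reported as a single score for panoptic outputs.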