CV · Jul 7, 2020

Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning

arXiv:2007.03169v1 · 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of recognizing individual objects in noisy 3D reconstructions, a prerequisite for high-level intelligent tasks. It offers a fast and scalable solution with incremental improvements over existing methods.

The paper tackles 3D instance segmentation for indoor environments by learning an embedding space in which object instances form clusters based on spatial and semantic information. It reports state-of-the-art AP scores on the ScanNet 3D instance segmentation benchmark.

We propose the spatial semantic embedding network (SSEN), a simple yet efficient algorithm for 3D instance segmentation using deep metric learning. The raw 3D reconstruction of an indoor environment suffers from occlusions and noise, and is produced without any meaningful distinction between individual entities. For high-level intelligent tasks on a large-scale scene, 3D instance segmentation recognizes individual instances of objects. We approach instance segmentation by simply learning the correct embedding space that maps individual instances of objects into distinct clusters that reflect both spatial and semantic information. Unlike previous approaches that require complex pre-processing or post-processing, our implementation is compact and fast with competitive performance, maintaining scalability on large scenes with high-resolution voxels. We demonstrate the state-of-the-art performance of our algorithm on the ScanNet 3D instance segmentation benchmark in terms of AP score.
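The abstract does not spell out the loss used to shape the embedding space, so the following is only a minimal sketch of a generic pull/push metric-learning loss of the kind often used for instance embeddings. It is not the SSEN loss. The function name `discriminative_loss` and the parameters `pull_margin` and `push_margin` are assumptions made for illustration.

```python
import numpy as np

def discriminative_loss(embeddings, instance_ids, pull_margin=0.1, push_margin=1.0):
    """Generic pull/push loss over per-point embeddings (illustrative, not SSEN's loss).

    embeddings:   (N, D) float array, one embedding per point or voxel.
    instance_ids: (N,) int array, ground-truth instance label per point.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    instance_ids = np.asarray(instance_ids)
    ids = np.unique(instance_ids)

    # One center per instance: the mean embedding of its points.
    centers = np.stack([embeddings[instance_ids == i].mean(axis=0) for i in ids])

    # Pull term: penalize points farther than pull_margin from their own center.
    pull = 0.0
    for k, i in enumerate(ids):
        dist = np.linalg.norm(embeddings[instance_ids == i] - centers[k], axis=1)
        pull += np.mean(np.maximum(dist - pull_margin, 0.0) ** 2)
    pull /= len(ids)

    # Push term: penalize pairs of centers closer than 2 * push_margin.
    push = 0.0
    if len(ids) > 1:
        diff = centers[:, None, :] - centers[None, :, :]
        center_dist = np.linalg.norm(diff, axis=2)
        off_diag = ~np.eye(len(ids), dtype=bool)
        push = np.mean(np.maximum(2.0 * push_margin - center_dist[off_diag], 0.0) ** 2)

    return pull + push
```

Under this sketch, embeddings that already form tight, well-separated clusters per instance incur zero loss, while embeddings that mix instances are penalized by both terms. At inference time, instances would then be recovered by clustering the learned embeddings; the clustering step is not shown here.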

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
