CV · Nov 1, 2022

Leveraging commonsense for object localisation in partial scenes

arXiv:2211.00562v1 · 5 citations · h-index: 45
Originality: Incremental advance
AI Analysis

This paper addresses the problem of estimating object positions in incomplete 3D scans, for applications such as robotics and augmented reality. It is an incremental advance with specific, measurable gains.

The paper tackles object localisation in partial 3D scenes by proposing a novel scene representation, the Directed Spatial Commonsense Graph (D-SCG), together with a Graph Neural Network that uses attentional message passing. On Partial ScanNet, it reports a 5.9% improvement in localisation accuracy and 8x faster training.

We propose an end-to-end solution to address the problem of object localisation in partial scenes, where we aim to estimate the position of an object in an unknown area given only a partial 3D scan of the scene. We propose a novel scene representation to facilitate the geometric reasoning, Directed Spatial Commonsense Graph (D-SCG), a spatial scene graph that is enriched with additional concept nodes from a commonsense knowledge base. Specifically, the nodes of D-SCG represent the scene objects and the edges are their relative positions. Each object node is then connected via different commonsense relationships to a set of concept nodes. With the proposed graph-based scene representation, we estimate the unknown position of the target object using a Graph Neural Network that implements a novel attentional message passing mechanism. The network first predicts the relative positions between the target object and each visible object by learning a rich representation of the objects via aggregating both the object nodes and the concept nodes in D-SCG. These relative positions are then merged to obtain the final position. We evaluate our method using Partial ScanNet, improving the state-of-the-art by 5.9% in terms of the localisation accuracy at an 8x faster training speed.
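
To make the pipeline in the abstract concrete, the following is a minimal sketch of the data flow only, not the authors' implementation. It assumes a toy graph container (`SceneGraph`), hand-written offsets standing in for the network's predicted relative positions, and a weighted average as the merging step. The abstract does not say how the relative positions are merged, so the averaging rule, the class, and all function and relation names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SceneGraph:
    """Toy stand-in for a D-SCG: object nodes plus attached concept nodes."""
    # Visible object nodes and their 3D positions in the partial scan.
    positions: Dict[str, Vec3]
    # Object -> list of (commonsense relation, concept node). Hypothetical labels.
    concepts: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def relative_edges(self) -> Dict[Tuple[str, str], Vec3]:
        """Directed edge (src, dst) stores the position of dst relative to src."""
        edges = {}
        for src, p in self.positions.items():
            for dst, q in self.positions.items():
                if src != dst:
                    edges[(src, dst)] = tuple(b - a for a, b in zip(p, q))
        return edges


def merge_offsets(
    positions: Dict[str, Vec3],
    offsets: Dict[str, Vec3],
    weights: Optional[Dict[str, float]] = None,
) -> Vec3:
    """Merge per-object predictions into one target position.

    Each visible object votes for `its position + predicted offset to target`.
    ASSUMPTION: votes are combined by a weighted mean; the paper may differ.
    """
    names = list(offsets)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return tuple(
        sum(weights[n] * (positions[n][i] + offsets[n][i]) for n in names) / total
        for i in range(3)
    )


# Two visible objects; the target (e.g. a chair) lies in the unseen area.
graph = SceneGraph(
    positions={"bed": (0.0, 0.0, 0.0), "desk": (2.0, 0.0, 0.0)},
    concepts={"desk": [("AtLocation", "office")]},
)
# Placeholder for the GNN output: predicted target offset from each visible object.
predicted = {"bed": (1.0, 1.0, 0.0), "desk": (-1.0, 1.0, 0.0)}
target = merge_offsets(graph.positions, predicted)
```

In this toy case both objects vote for the same point, so `target` equals that point. Passing non-uniform `weights` would let more reliable objects dominate the estimate, which is one plausible use of attention scores, again as an assumption.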

