RO | CV | May 5, 2025

Estimating Commonsense Scene Composition on Belief Scene Graphs

arXiv:2505.02405v1 | h-index: 18 | ICRA
Originality: Incremental advance
AI Analysis

This work addresses scene understanding for robotic systems by extending Belief Scene Graphs. It appears incremental, since it builds on existing graph-based and neuro-symbolic methods.

The paper tackles the problem of commonsense scene composition by estimating spatial distributions of unseen objects in Belief Scene Graphs, using a framework with two variants of a Correlation Information model, and validates it on simulated and real-world indoor data.

This work establishes the concept of commonsense scene composition, with a focus on extending Belief Scene Graphs by estimating the spatial distribution of unseen objects. Specifically, the commonsense scene composition capability refers to the understanding of the spatial relationships among related objects in the scene, which in this article is modeled as a joint probability distribution for all possible locations of the semantic object class. The proposed framework includes two variants of a Correlation Information (CECI) model for learning probability distributions: (i) a baseline approach based on a Graph Convolutional Network, and (ii) a neuro-symbolic extension that integrates a spatial ontology based on Large Language Models (LLMs). Furthermore, this article provides a detailed description of the dataset generation process for such tasks. Finally, the framework has been validated through multiple runs on simulated data, as well as in a real-world indoor environment, demonstrating its ability to spatially interpret scenes across different room types.
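To make the idea of a probability distribution over possible object locations more concrete, the following is a minimal sketch, not the paper's CECI model. It assumes a single graph-convolution layer with the common symmetric normalization, mean pooling over nodes, and a softmax over a hypothetical 4x4 grid of candidate locations. All names (`gcn_layer`, `location_distribution`), the grid size, and the random weights are illustrative assumptions.

```python
import numpy as np


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


def location_distribution(adj, feats, w_hidden, w_out):
    """Map a scene graph to a probability vector over candidate cells."""
    hidden = gcn_layer(adj, feats, w_hidden)
    pooled = hidden.mean(axis=0)              # graph-level embedding
    logits = pooled @ w_out                   # one logit per candidate cell
    logits = logits - logits.max()            # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# Hypothetical toy scene: one node connected to two observed objects.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)
feats = rng.normal(size=(3, 4))               # assumed 4-dim node features
w_hidden = rng.normal(size=(4, 8))            # untrained, random weights
w_out = rng.normal(size=(8, 16))              # 16 cells = assumed 4x4 grid

probs = location_distribution(adj, feats, w_hidden, w_out)
grid = probs.reshape(4, 4)                    # spatial view of the distribution
```

In a trained model the weights would be learned from data so that high-probability cells correspond to plausible placements of the unseen object class. Here they are random, so only the shape and normalization of the output are meaningful.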

