RO · CV · LG · Nov 14, 2023

SceneScore: Learning a Cost Function for Object Arrangement

arXiv:2311.08530v1 · 6 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling robots to create human-like arrangements without environment interaction or human supervision. It appears incremental, however, because it builds on existing energy-based models and graph neural networks.

The paper tackles the problem of evaluating object arrangements for robots by learning a cost function called SceneScore from example images, enabling tasks like predicting poses for missing objects and generalizing to novel objects with semantic features.

Arranging objects correctly is a key capability for robots which unlocks a wide range of useful tasks. A prerequisite for creating successful arrangements is the ability to evaluate the desirability of a given arrangement. Our method "SceneScore" learns a cost function for arrangements, such that desirable, human-like arrangements have a low cost. We learn the distribution of training arrangements offline using an energy-based model, solely from example images without requiring environment interaction or human supervision. Our model is represented by a graph neural network which learns object-object relations, using graphs constructed from images. Experiments demonstrate that the learned cost function can be used to predict poses for missing objects, generalise to novel objects using semantic features, and can be composed with other cost functions to satisfy constraints at inference time.
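
The abstract's last two claims (predicting a pose for a missing object, and composing the learned cost with other cost functions at inference time) can be illustrated with a minimal sketch. This is not the paper's implementation: `toy_scene_cost` is a hypothetical hand-written stand-in for the learned graph-neural-network cost, `region_cost` is an assumed example constraint, and the grid search in `predict_pose` is a simple substitute for whatever optimiser the authors use. All names and parameters here are illustrative assumptions.

```python
import math
from itertools import product

def toy_scene_cost(pose, others, preferred_dist=1.0):
    """Hypothetical stand-in for a learned arrangement cost.

    Low when the candidate pose lies at `preferred_dist` from every
    already-placed object; a real model would be learned from images.
    """
    x, y = pose
    return sum((math.hypot(x - ox, y - oy) - preferred_dist) ** 2
               for ox, oy in others)

def region_cost(pose, x_max=0.5):
    """Assumed constraint: penalise poses whose x exceeds x_max."""
    return max(0.0, pose[0] - x_max) ** 2

def compose(*weighted_costs):
    """Combine (cost_fn, weight) pairs into one cost by weighted sum."""
    def total(pose):
        return sum(weight * cost(pose) for cost, weight in weighted_costs)
    return total

def predict_pose(cost, lo=-2.0, hi=2.0, steps=41):
    """Return the 2D grid point with the lowest cost (exhaustive search)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(product(grid, grid), key=cost)

# One object already placed at the origin; find a pose for the missing one.
placed = [(0.0, 0.0)]
scene = lambda pose: toy_scene_cost(pose, placed)

unconstrained = predict_pose(scene)
constrained = predict_pose(compose((scene, 1.0), (region_cost, 100.0)))
```

Because the composed cost is a weighted sum, the constraint can be added or removed at inference time without retraining the arrangement cost; the weight (100.0 here, an arbitrary choice) controls how strictly the constraint is enforced.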
