LG · CV · Sep 11, 2023

Learning Geometric Representations of Objects via Interaction

arXiv:2309.05346v1 · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses representation learning for robots and other AI agents that interact with objects. It appears incremental because it builds on existing unsupervised learning frameworks.

The paper tackles the problem of learning geometric representations of agent and object locations from unstructured observations, using only the agent's actions as supervision. It shows that the approach outperforms vision-based methods and enables efficient reinforcement learning.

We address the problem of learning representations from observations of a scene involving an agent and an external object the agent interacts with. To this end, we propose a representation learning framework that extracts the location in physical space of both the agent and the object from unstructured observations of arbitrary nature. Our framework relies on the actions performed by the agent as the only source of supervision, while assuming that the object is displaced by the agent via unknown dynamics. We provide a theoretical foundation and formally prove that an ideal learner is guaranteed to infer an isometric representation, disentangling the agent from the object and correctly extracting their locations. We empirically evaluate our framework on a variety of scenarios, showing that it outperforms vision-based approaches such as a state-of-the-art keypoint extractor. We further demonstrate how the extracted representations enable the agent to solve downstream tasks efficiently via reinforcement learning.
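
The abstract does not spell out the training objective, so the following is only an illustrative sketch and not the authors' implementation. It assumes that each action is a displacement vector that shifts the agent's location additively, and it checks the isometry property named in the abstract (pairwise distances are preserved). The names `equivariance_loss`, `is_isometric`, `shift`, and `scale` are hypothetical.

```python
import math


def equivariance_loss(z_before, z_after, action):
    """Squared error between the predicted location (z_before + action)
    and the encoded location after the action.

    Assumption: actions act on the agent's location by vector addition.
    """
    return sum((zb + a - za) ** 2 for zb, za, a in zip(z_before, z_after, action))


def is_isometric(points, embed, tol=1e-9):
    """Return True if `embed` preserves all pairwise Euclidean distances
    among `points` up to the tolerance `tol`."""
    for i, p in enumerate(points):
        for q in points[i + 1:]:
            if abs(math.dist(p, q) - math.dist(embed(p), embed(q))) > tol:
                return False
    return True


# Two toy "representations" of 2D locations (hypothetical examples).
def shift(p):
    """A translated copy of the true location: distances are preserved."""
    return (p[0] + 3.0, p[1] - 1.0)


def scale(p):
    """A rescaled copy of the true location: distances are doubled."""
    return (2.0 * p[0], 2.0 * p[1])
```

Under the additive-action assumption, a translated representation such as `shift` incurs zero loss for any action, while a rescaled one such as `scale` does not, which gives some intuition for why action supervision could favour distance-preserving representations.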

Code Implementations: 1 repo
