LG, CV, RO
Jan 22, 2024

Scaling Face Interaction Graph Networks to Real World Scenes

DeepMind
arXiv:2401.11985v1
6 citations
h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in applying learned simulators to robotics, engineering, and graphics settings where only perceptual data is available. It represents an incremental advance.

The paper tackles the problem of scaling graph-based learned simulators to real-world scenes with many objects and perceptual inputs. It introduces a memory-efficient simulation model and a perceptual interface based on editable NeRFs, and shows that the method uses substantially less memory than previous graph-based simulators while retaining their accuracy and can be applied to real scenes captured from multiple camera angles.

Accurately simulating real-world object dynamics is essential for various applications such as robotics, engineering, graphics, and design. To better capture complex real dynamics such as contact and friction, learned simulators based on graph networks have recently shown great promise. However, applying these learned simulators to real scenes comes with two major challenges: first, scaling learned simulators to handle the complexity of real-world scenes, which can involve hundreds of objects, each with complicated 3D shapes; and second, handling inputs from perception rather than 3D state information. Here we introduce a method which substantially reduces the memory required to run graph-based learned simulators. Based on this memory-efficient simulation model, we then present a perceptual interface in the form of editable NeRFs which can convert real-world scenes into a structured representation that can be processed by a graph network simulator. We show that our method uses substantially less memory than previous graph-based simulators while retaining their accuracy, and that the simulators learned in synthetic environments can be applied to real-world scenes captured from multiple camera angles. This paves the way for expanding the application of learned simulators to settings where only perceptual information is available at inference time.
