LG · RO · Jun 28, 2024

Modeling the Real World with High-Density Visual Particle Dynamics

arXiv:2406.19800v1 · 9 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient world modeling for robotics and shows how the learned model can be used to evaluate motion plans in tasks such as box pushing and can grasping. It is an incremental advance, since it builds on existing particle dynamics approaches.

The paper tackles the problem of modeling physical dynamics in real scenes by introducing High-Density Visual Particle Dynamics (HD-VPD), a learned world model that processes latent point clouds of more than 100K particles. Compared to the previous graph neural network approach, its dynamics model runs twice as fast at the same prediction quality and reaches higher quality when given 4x as many particles.

We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.
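The abstract describes Interlacers as alternating linear-attention Performer layers with graph-based neighbour attention layers. The following is a minimal NumPy sketch of that interleaving idea, not the authors' implementation: every function name, the parameter layout, the residual connections, and the brute-force k-nearest-neighbour search are assumptions made for illustration, and learned components such as MLPs and normalization are omitted.

```python
import numpy as np

def positive_features(x, proj):
    """Performer-style positive random features: phi(q) . phi(k) approximates exp(q . k)."""
    xw = x @ proj.T                                        # (n, m)
    sq_norm = 0.5 * np.sum(x * x, axis=-1, keepdims=True)  # (n, 1)
    # The global shift rescales all features equally and cancels in the attention ratio.
    return np.exp(xw - sq_norm - xw.max()) / np.sqrt(proj.shape[0])

def linear_attention(x, wq, wk, wv, proj):
    """Global attention over all particles in O(n * m * d), never forming the n x n matrix."""
    d = x.shape[-1]
    q = (x @ wq) / d ** 0.25
    k = (x @ wk) / d ** 0.25
    v = x @ wv
    qf, kf = positive_features(q, proj), positive_features(k, proj)
    kv = kf.T @ v                    # (m, d) summary of all keys and values
    norm = qf @ kf.sum(axis=0)       # (n,) per-query normalizer
    return (qf @ kv) / norm[:, None]

def neighbor_attention(x, pos, wq, wk, wv, k):
    """Softmax attention restricted to each particle's k nearest neighbours (self included)."""
    # Assumption: brute-force O(n^2) distances, fine for a toy but not for 100K+ particles.
    diff = pos[:, None, :] - pos[None, :, :]
    idx = np.argsort(np.sum(diff * diff, axis=-1), axis=1)[:, :k]   # (n, k)
    q = x @ wq
    keys, vals = (x @ wk)[idx], (x @ wv)[idx]                       # (n, k, d) each
    logits = np.einsum("nd,nkd->nk", q, keys) / np.sqrt(x.shape[-1])
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", weights, vals)

def interlacer(x, pos, layers, k):
    """Alternate a global linear-attention layer with a local neighbour-attention layer."""
    for layer in layers:
        x = x + linear_attention(x, *layer["global"])       # hypothetical residual update
        x = x + neighbor_attention(x, pos, *layer["local"], k)
    return x
```

In this sketch, each entry of `layers` is a dictionary whose `"global"` value is a tuple `(wq, wk, wv, proj)` and whose `"local"` value is a tuple `(wq, wk, wv)`. The split reflects the trade-off the abstract points to: the linear-attention layer mixes information across the whole point cloud at a cost linear in the number of particles, while the neighbour layer captures local geometric interactions.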
