Physical Primitive Decomposition
This work addresses the challenge of developing interpretable object representations that intelligent agents can use to interact with the world. Its contribution is incremental: it extends existing decomposition methods by incorporating physical signals.
The paper tackles the problem of understanding objects through their physical and geometric components. It proposes a model that learns physical primitives from both appearance and physical behavior, and it reports good performance on block towers and tools in synthetic and real scenarios.
Objects are made of parts, each with distinct geometry, physics, functionality, and affordances. Developing such a distributed, physical, interpretable representation of objects will help intelligent agents better explore and interact with the world. In this paper, we study physical primitive decomposition: understanding an object through its components, each with physical and geometric attributes. As annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object's appearance and its behaviors in physical events. Our model performs well on block towers and tools in both synthetic and real scenarios; we also demonstrate that visual and physical observations often provide complementary signals. We further present ablation and behavioral studies to better understand our model and contrast it with human performance.
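To make the idea of a primitive with both geometric and physical attributes concrete, the following is a minimal sketch and not the paper's model. It assumes each primitive is an axis-aligned box with uniform density. The names `PhysicalPrimitive`, `total_mass`, and `center_of_mass` are hypothetical and chosen only for illustration. The sketch shows why appearance alone can be insufficient: two boxes with identical shape but different densities shift the object's center of mass, which in turn affects its physical behavior.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PhysicalPrimitive:
    """Hypothetical primitive: an axis-aligned box with uniform density."""
    center: Vec3      # geometric attribute: box center (x, y, z)
    size: Vec3        # geometric attribute: box extents (width, height, depth)
    density: float    # physical attribute: mass per unit volume

    @property
    def volume(self) -> float:
        w, h, d = self.size
        return w * h * d

    @property
    def mass(self) -> float:
        return self.density * self.volume


def total_mass(parts: List[PhysicalPrimitive]) -> float:
    """Sum of the masses of all primitives."""
    return sum(p.mass for p in parts)


def center_of_mass(parts: List[PhysicalPrimitive]) -> Vec3:
    """Mass-weighted average of the primitive centers."""
    m = total_mass(parts)
    x, y, z = (sum(p.mass * p.center[i] for p in parts) / m for i in range(3))
    return (x, y, z)


if __name__ == "__main__":
    # Two unit cubes that look the same but differ in density (assumed values).
    light = PhysicalPrimitive(center=(0.0, 0.0, 0.0), size=(1.0, 1.0, 1.0), density=1.0)
    heavy = PhysicalPrimitive(center=(2.0, 0.0, 0.0), size=(1.0, 1.0, 1.0), density=3.0)
    print(total_mass([light, heavy]))
    print(center_of_mass([light, heavy]))
```

In this toy setting the geometric midpoint of the two cubes is at x = 1.0, but the center of mass lies closer to the denser cube. A quantity like this cannot be recovered from shape alone, which is one way to read the claim that visual and physical observations are complementary.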