Hierarchical Relational Inference
This paper addresses physical reasoning for AI systems, but its contribution is incremental, as it builds on prior unsupervised learning approaches.
The paper tackles the problem of common-sense physical reasoning by modeling objects as hierarchies of parts to capture complex behaviors, and it improves over a strong baseline in modeling synthetic and real-world videos.
Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics. The notion of an abstract object, however, encompasses a wide variety of physical objects that differ greatly in terms of the complex behaviors they support. To address this, we propose a novel approach to physical reasoning that models objects as hierarchies of parts that may locally behave separately, but also act more globally as a single whole. Unlike prior approaches, our method learns in an unsupervised fashion directly from raw visual images to discover objects, parts, and their relations. It explicitly distinguishes multiple levels of abstraction and improves over a strong baseline at modeling synthetic and real-world videos.
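To make the idea of parts that "locally behave separately, but also act more globally as a single whole" concrete, the following is a minimal hand-written sketch of a part hierarchy. It is not the paper's method, which learns objects, parts, and relations from raw images; the names `Node`, `local_offset`, `move`, and `world_positions` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec = Tuple[float, float]


@dataclass
class Node:
    """A node in a hypothetical part hierarchy (a whole object or one of its parts)."""
    name: str
    local_offset: Vec  # position relative to the parent node
    children: List["Node"] = field(default_factory=list)

    def move(self, dx: float, dy: float) -> None:
        # Shifting a node's offset moves it and, implicitly, all of its descendants.
        x, y = self.local_offset
        self.local_offset = (x + dx, y + dy)


def world_positions(node: Node, origin: Vec = (0.0, 0.0)) -> Dict[str, Vec]:
    """Return {name: absolute position} by composing offsets down the tree."""
    x = origin[0] + node.local_offset[0]
    y = origin[1] + node.local_offset[1]
    positions = {node.name: (x, y)}
    for child in node.children:
        positions.update(world_positions(child, (x, y)))
    return positions


# Illustrative object with two parts (assumed example, not from the paper).
arm = Node("arm", (1.0, 0.0))
leg = Node("leg", (0.0, -1.0))
body = Node("body", (0.0, 0.0), [arm, leg])

body.move(2.0, 0.0)  # global motion: every part follows the whole
arm.move(0.0, 0.5)   # local motion: only the arm changes
positions = world_positions(body)
```

In this toy version the hierarchy is given by hand and motion is a simple translation; the approach described in the abstract instead has to discover such structure without supervision.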