LG, AI | Jul 7, 2025

Object-centric Denoising Diffusion Models for Physical Reasoning

arXiv:2507.04920v1 | h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses a limitation of physical reasoning in machine learning: existing autoregressive models cannot be conditioned on later states. The advance is incremental, since it applies denoising diffusion models, already used for similar problems in planning, to this domain.

The paper tackles physical reasoning about multiple interacting objects. It proposes an object-centric denoising diffusion model that can be conditioned on arbitrary time steps for arbitrary objects, which allows tasks with multiple conditions, and it examines performance when object numbers and trajectory lengths change during inference.

Reasoning about the trajectories of multiple, interacting objects is integral to physical reasoning tasks in machine learning. This involves conditions imposed on the objects at different time steps, for instance initial states or desired goal states. Existing approaches in physical reasoning generally rely on autoregressive modeling, which can only be conditioned on initial states, but not on later states. In fields such as planning for reinforcement learning, similar challenges are being addressed with denoising diffusion models. In this work, we propose an object-centric denoising diffusion model architecture for physical reasoning that is translation equivariant over time, permutation equivariant over objects, and can be conditioned on arbitrary time steps for arbitrary objects. We demonstrate how this model can solve tasks with multiple conditions and examine its performance when changing object numbers and trajectory lengths during inference.
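The properties named in the abstract can be illustrated with a small sketch. Everything here is an assumption for illustration only: the abstract does not say how conditioning is implemented, so the sketch uses mask-based overwriting of known entries, and `toy_denoiser` is a hypothetical stand-in for a learned network, not the paper's architecture. Trajectories are stored as an array of shape (objects, time steps, state dimension).

```python
import numpy as np

def apply_conditions(x, cond_values, cond_mask):
    """Overwrite conditioned entries with their known values (assumed mechanism)."""
    return np.where(cond_mask, cond_values, x)

def toy_denoiser(x):
    """Hypothetical stand-in for a learned denoiser.

    Mean pooling over objects keeps it permutation equivariant; a fixed
    temporal kernel applies the same weights at every time step
    (translation equivariant away from the padded boundaries).
    """
    pooled = x.mean(axis=0, keepdims=True)        # shared summary of all objects
    mixed = 0.5 * x + 0.5 * pooled
    padded = np.pad(mixed, ((0, 0), (1, 1), (0, 0)), mode="edge")
    return 0.25 * padded[:, :-2] + 0.5 * padded[:, 1:-1] + 0.25 * padded[:, 2:]

rng = np.random.default_rng(0)
n_objects, n_steps, state_dim = 3, 8, 2
x = rng.normal(size=(n_objects, n_steps, state_dim))  # start from noise

# Conditions: initial states for all objects, plus a goal state for object 1.
cond_mask = np.zeros_like(x, dtype=bool)
cond_values = np.zeros_like(x)
cond_mask[:, 0] = True
cond_values[:, 0] = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
cond_mask[1, -1] = True
cond_values[1, -1] = [2.0, 2.0]

# Toy refinement loop: denoise, then re-impose the conditions each step.
for _ in range(10):
    x = apply_conditions(toy_denoiser(x), cond_values, cond_mask)

# Permuting the objects before or after the denoiser gives the same result.
perm = np.array([2, 0, 1])
is_equivariant = np.allclose(toy_denoiser(x[perm]), toy_denoiser(x)[perm])
```

Because the mask can mark any object at any time step, the same loop covers initial-state conditions, goal-state conditions, or both at once, which is the kind of flexibility an autoregressive rollout from initial states does not offer.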

