CE · Apr 18

Watching Physics: the Generative Science of Matter and Motion

arXiv:2604.16843 · 95.7 · h-index: 22
AI Analysis

For researchers in physics, engineering, and computer vision, this work establishes a framework to evaluate and improve the physical validity of generative models, turning them into scientific instruments for inference and design.

The paper demonstrates that generative video models can learn physically valid dynamics from images and video when coupled with experiments and high-fidelity simulations, but that visual plausibility stops guaranteeing physical admissibility when internal state variables dominate. Using rubber compression, can crushing, and cardiac motion as test systems, the authors identify regimes where visual learning succeeds (e.g., recovering surface strain), where it fails, and where it requires mechanistic supervision.

Can we learn the physics of matter in motion directly from images and video--and trust it? Answering this question requires integrating experiments, physics-based simulation, and data across traditionally separate disciplines. Much of this knowledge is visual and temporal rather than textual: images and videos encode structure, dynamics, and causality that equations alone cannot fully capture. Recent generative models produce compelling visual content, yet they rely on observational data and often lack physical validity. Here we show that generative video models gain scientific value when they couple visual data with experiments and high-fidelity simulations. Using deformation mechanics as a testbed, we study three systems of increasing complexity--rubber compression, can crushing, and cardiac motion--and identify regimes in which visual learning succeeds, fails, and requires mechanistic supervision. When physics manifests in visible kinematics, generative models recover measurable quantities such as surface strain; when internal state variables dominate, visual plausibility no longer ensures physical admissibility. We propose that this convergence defines a new frontier, the Generative Sciences of Matter and Motion, which unifies Simulogenics, Physiogenics, and Materiogenics. These physics-grounded foundation models can turn visual generation into a scientific instrument for inference, prediction, and design of matter in motion.
