CV · Sep 24, 2025

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

arXiv:2509.20358v2 · 29 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses the need for more realistic and controllable video generation in computer vision and graphics. It is an incremental advance, since it builds on existing diffusion models and physics simulators.

The paper tackles the lack of physical plausibility and 3D controllability in video generation models. It introduces PhysCtrl, a framework that generates physics-grounded motion trajectories to drive image-to-video models. The authors report that the resulting videos are high-fidelity and controllable, and that they outperform existing methods in visual quality and physical plausibility.

Existing video generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. To overcome these limitations, we introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. At its core is a generative physics network that learns the distribution of physical dynamics across four materials (elastic, sand, plasticine, and rigid) via a diffusion model conditioned on physics parameters and applied forces. We represent physical dynamics as 3D point trajectories and train on a large-scale synthetic dataset of 550K animations generated by physics simulators. We enhance the diffusion model with a novel spatiotemporal attention block that emulates particle interactions and incorporates physics-based constraints during training to enforce physical plausibility. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos that outperform existing methods in both visual quality and physical plausibility. Project Page: https://cwchenwang.github.io/physctrl
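The abstract describes motion as 3D point trajectories processed by a spatiotemporal attention block, conditioned on physics parameters and applied forces. The NumPy sketch below illustrates one common way to arrange such data and attention. It is not the authors' implementation. The tensor layout `(T, N, 3)`, the conditioning by concatenation, the identity attention projections, and the names `material_param`, `force`, and `spatiotemporal_block` are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Single-head self-attention with identity projections (a simplification;
    # a trained model would use learned query/key/value matrices).
    # x has shape (..., L, D); attention mixes the L axis.
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)  # (..., L, L)
    return softmax(scores, axis=-1) @ x               # (..., L, D)

def spatiotemporal_block(feats):
    # feats has shape (T, N, D): T frames, N points, D features per point.
    # Spatial step: points within the same frame attend to each other.
    feats = feats + self_attention(feats)
    # Temporal step: each point attends to itself across frames.
    per_point = np.swapaxes(feats, 0, 1)              # (N, T, D)
    per_point = per_point + self_attention(per_point)
    return np.swapaxes(per_point, 0, 1)               # back to (T, N, D)

rng = np.random.default_rng(0)
T, N = 8, 16                                   # hypothetical sizes
trajectories = rng.normal(size=(T, N, 3))      # 3D position per point per frame

# Hypothetical conditioning: one scalar physics parameter and one force vector,
# broadcast to every point in every frame and concatenated to the positions.
material_param = 1.0
force = np.array([0.0, 0.0, -1.0])
cond = np.concatenate([[material_param], force])        # (4,)
cond = np.broadcast_to(cond, (T, N, cond.shape[0]))     # (T, N, 4)

feats = np.concatenate([trajectories, cond], axis=-1)   # (T, N, 7)
out = spatiotemporal_block(feats)
```

Factoring attention into a spatial pass and a temporal pass keeps the cost at roughly O(T·N² + N·T²) rather than O((T·N)²) for full attention over all point-frame pairs. Whether PhysCtrl factors its attention this way is not stated in the abstract.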

