CVSep 11, 2024
Phy124: Fast Physics-Driven 4D Content Generation from a Single ImageJiajing Lin, Zhenzhong Wang, Yongjie Hou et al.
4D content generation focuses on creating dynamic 3D objects that change over time. Existing methods primarily rely on pre-trained video diffusion models, utilizing sampling processes or reference videos. However, these approaches face significant challenges. Firstly, the generated 4D content often fails to adhere to real-world physics since video diffusion models do not incorporate physical priors. Secondly, the extensive sampling process and the large number of parameters in diffusion models result in exceedingly time-consuming generation processes. To address these issues, we introduce Phy124, a novel, fast, and physics-driven method for controllable 4D content generation from a single image. Phy124 integrates physical simulation directly into the 4D generation process, ensuring that the resulting 4D content adheres to natural physical laws. Phy124 also eliminates the use of diffusion models during the 4D dynamics generation phase, significantly speeding up the process. Phy124 allows for the control of 4D dynamics, including movement speed and direction, by manipulating external forces. Extensive experiments demonstrate that Phy124 generates high-fidelity 4D content with significantly reduced inference times, achieving stateof-the-art performance. The code and generated 4D content are available at the provided link: https://anonymous.4open.science/r/BBF2/.
CVNov 25, 2024
Phys4DGen: Physics-Compliant 4D Generation with Multi-Material Composition PerceptionJiajing Lin, Zhenzhong Wang, Dejun Xu et al.
4D content generation aims to create dynamically evolving 3D content that responds to specific input objects such as images or 3D representations. Current approaches typically incorporate physical priors to animate 3D representations, but these methods suffer from significant limitations: they not only require users lacking physics expertise to manually specify material properties but also struggle to effectively handle the generation of multi-material composite objects. To address these challenges, we propose Phys4DGen, a novel 4D generation framework that integrates multi-material composition perception with physical simulation. The framework achieves automated, physically plausible 4D generation through three innovative modules: first, the 3D Material Grouping module partitions heterogeneous material regions on 3D representations' surfaces via semantic segmentation; second, the Internal Physical Structure Discovery module constructs the mechanical structure of object interiors; finally, we distill physical prior knowledge from multimodal large language models to enable rapid and automatic material properties identification for both objects' surfaces and interiors. Experiments on both synthetic and real-world datasets demonstrate that Phys4DGen can generate high-fidelity 4D content with physical realism in open-world scenarios, significantly outperforming state-of-the-art methods.
CVAug 19, 2025
VisionLaw: Inferring Interpretable Intrinsic Dynamics from Visual Observations via Bilevel OptimizationJiajing Lin, Shu Jiang, Qingyuan Zeng et al.
The intrinsic dynamics of an object governs its physical behavior in the real world, playing a critical role in enabling physically plausible interactive simulation with 3D assets. Existing methods have attempted to infer the intrinsic dynamics of objects from visual observations, but generally face two major challenges: one line of work relies on manually defined constitutive priors, making it difficult to generalize to complex scenarios; the other models intrinsic dynamics using neural networks, resulting in limited interpretability and poor generalization. To address these challenges, we propose VisionLaw, a bilevel optimization framework that infers interpretable expressions of intrinsic dynamics from visual observations. At the upper level, we introduce an LLMs-driven decoupled constitutive evolution strategy, where LLMs are prompted as a knowledgeable physics expert to generate and revise constitutive laws, with a built-in decoupling mechanism that substantially reduces the search complexity of LLMs. At the lower level, we introduce a vision-guided constitutive evaluation mechanism, which utilizes visual simulation to evaluate the consistency between the generated constitutive law and the underlying intrinsic dynamics, thereby guiding the upper-level evolution. Experiments on both synthetic and real-world datasets demonstrate that VisionLaw can effectively infer interpretable intrinsic dynamics from visual observations. It significantly outperforms existing state-of-the-art methods and exhibits strong generalization for interactive simulation in novel scenarios.
CRAug 10, 2025
Fading the Digital Ink: A Universal Black-Box Attack Framework for 3DGS Watermarking SystemsQingyuan Zeng, Shu Jiang, Jiajing Lin et al.
With the rise of 3D Gaussian Splatting (3DGS), a variety of digital watermarking techniques, embedding either 1D bitstreams or 2D images, are used for copyright protection. However, the robustness of these watermarking techniques against potential attacks remains underexplored. This paper introduces the first universal black-box attack framework, the Group-based Multi-objective Evolutionary Attack (GMEA), designed to challenge these watermarking systems. We formulate the attack as a large-scale multi-objective optimization problem, balancing watermark removal with visual quality. In a black-box setting, we introduce an indirect objective function that blinds the watermark detector by minimizing the standard deviation of features extracted by a convolutional network, thus rendering the feature maps uninformative. To manage the vast search space of 3DGS models, we employ a group-based optimization strategy to partition the model into multiple, independent sub-optimization problems. Experiments demonstrate that our framework effectively removes both 1D and 2D watermarks from mainstream 3DGS watermarking methods while maintaining high visual fidelity. This work reveals critical vulnerabilities in existing 3DGS copyright protection schemes and calls for the development of more robust watermarking systems.
AIDec 20, 2024
Enhancing Large-scale UAV Route Planing with Global and Local Features via Reinforcement Graph FusionTao Zhou, Kai Ye, Zeyu Shi et al.
Numerous remarkable advancements have been made in accuracy, speed, and parallelism for solving the Unmanned Aerial Vehicle Route Planing (UAVRP). However, existing UAVRP solvers face challenges when attempting to scale effectively and efficiently for larger instances. In this paper, we present a generalization framework that enables current UAVRP solvers to robustly extend their capabilities to larger instances, accommodating up to 10,000 points, using widely recognized test sets. The UAVRP under a large number of patrol points is a typical large-scale TSP problem.Our proposed framework comprises three distinct steps. Firstly, we employ Delaunay triangulation to extract subgraphs from large instances while preserving global features. Secondly, we utilize an embedded TSP solver to obtain sub-results, followed by graph fusion. Finally, we implement a decoding strategy customizable to the user's requirements, resulting in high-quality solutions, complemented by a warming-up process for the heatmap. To demonstrate the flexibility of our approach, we integrate two representative TSP solvers into our framework and conduct a comprehensive comparative analysis against existing algorithms using large TSP benchmark datasets. The results unequivocally demonstrate that our framework efficiently scales existing TSP solvers to handle large instances and consistently outperforms state-of-the-art (SOTA) methods. Furthermore, since our proposed framework does not necessitate additional training or fine-tuning, we believe that its generality can significantly advance research on end-to-end UAVRP solvers, enabling the application of a broader range of methods to real-world scenarios.