44.9 NA May 30
Solver-in-the-Loop joint operator learning: fractional Laplace-Beltrami features for interface reconstruction
Yangyang Zheng, Huayi Wei, Shuhao Cao et al.
In this work, we propose a joint operator learning method for reconstructing images of conductivity coefficients from boundary data. Inspired by the idea of using partial differential equation (PDE) solvers as preconditioners for this inverse problem, we investigate a "solver-in-the-loop" training mechanism, in which learnable parameters embedded in a PDE solver module are trained jointly with those of the neural networks that reconstruct the images. Specifically, we employ a fractional Laplace-Beltrami operator with a learnable fractional order to transform boundary data into high-dimensional features. These features then serve as input to a neural network, significantly improving reconstruction accuracy. For this purpose, we develop a Learning-Automated FEM (LA-FEM) package, with PyTorch as a backend, that supports this "solver-in-the-loop" training. The LA-FEM module lets automatic differentiation of an objective function propagate through both the PDE solver for the forward problem and the coupled neural networks for the inverse problem.
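The idea of a learnable fractional order can be illustrated with a minimal PyTorch sketch. It is not the LA-FEM API and not the authors' FEM discretization: it assumes the boundary is the unit circle, where the Laplace-Beltrami operator is diagonal in the Fourier basis and its fractional power has the symbol |k|^(2s). The names `fractional_lb_circle` and `fit_order` are hypothetical, and no reconstruction network is included; the sketch only shows that gradients of a loss reach the order `s` through the operator.

```python
import math
import torch


def fractional_lb_circle(g, s):
    """Apply (-Laplace-Beltrami)^s on the unit circle to uniform samples g.

    Assumption: spectral definition via the Fourier symbol |k|^(2s).
    """
    n = g.shape[-1]
    k = torch.arange(1, n // 2 + 1, dtype=g.dtype)  # nonzero wavenumbers
    # The constant mode (k = 0) is mapped to zero; handling it separately
    # avoids log(0) when differentiating k^(2s) with respect to s.
    symbol = torch.cat([torch.zeros(1, dtype=g.dtype), k.pow(2 * s)])
    return torch.fft.irfft(torch.fft.rfft(g) * symbol, n=n)


def fit_order(g, target, s0=0.3, steps=500, lr=0.02):
    """Recover the fractional order by gradient descent through the operator."""
    s = torch.nn.Parameter(torch.tensor(s0, dtype=g.dtype))
    opt = torch.optim.Adam([s], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((fractional_lb_circle(g, s) - target) ** 2)
        loss.backward()  # autograd differentiates through rfft/irfft
        opt.step()
    return s.item()


# Synthetic boundary data: a single Fourier mode sampled at 64 points.
theta = torch.linspace(0, 2 * math.pi, 65, dtype=torch.float64)[:-1]
g = torch.cos(3 * theta)

# Target features generated with a "true" order of 0.75 (illustrative value).
target = fractional_lb_circle(g, torch.tensor(0.75, dtype=torch.float64))
s_fit = fit_order(g, target)
print(f"fitted order: {s_fit:.3f}")
```

In a joint setup of the kind the abstract describes, the output of such an operator would feed a neural network, and a single optimizer would update both the network weights and the order `s`.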
CV Dec 15, 2025
Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
Team Seedance, Heyi Chen, Siyan Chen et al.
Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint audio-video generation. Leveraging a dual-branch Diffusion Transformer architecture, the model integrates a cross-modal joint module with a specialized multi-stage data pipeline, achieving exceptional audio-visual synchronization and superior generation quality. To ensure practical utility, we implement meticulous post-training optimizations, including Supervised Fine-Tuning (SFT) on high-quality datasets and Reinforcement Learning from Human Feedback (RLHF) with multi-dimensional reward models. Furthermore, we introduce an acceleration framework that boosts inference speed by over 10X. Seedance 1.5 pro distinguishes itself through precise multilingual and dialect lip-syncing, dynamic cinematic camera control, and enhanced narrative coherence, positioning it as a robust engine for professional-grade content creation. Seedance 1.5 pro is now accessible on Volcano Engine at https://console.volcengine.com/ark/region:ark+cn-beijing/experience/vision?type=GenVideo.