Barbara Solenthaler

h-index25

13papers

246citations

Novelty54%

AI Score42

Ranked #62,382 of 194,257 authors (top 32%)#21,450 in CV (top 36%)

13 Papers

6.8CVFeb 28, 2023Code

Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision

Aleksandra Franz, Barbara Solenthaler, Nils Thuerey

We address the challenging problem of jointly inferring the 3D flow and volumetric densities moving in a fluid from a monocular input video with a deep neural network. Despite the complexity of this task, we show that it is possible to train the corresponding networks without requiring any 3D ground truth for training. In the absence of ground truth data we can train our model with observations from real-world capture setups instead of relying on synthetic reconstructions. We make this unsupervised training approach possible by first generating an initial prototype volume which is then moved and transported over time without the need for volumetric supervision. Our approach relies purely on image-based losses, an adversarial discriminator network, and regularization. Our method can estimate long-term sequences in a stable manner, while achieving closely matching targets for inputs such as rising smoke plumes.

1.5CVNov 27, 2023

Spatially Adaptive Cloth Regression with Implicit Neural Representations

Lei Shu, Vinicius Azevedo, Barbara Solenthaler et al.

The accurate representation of fine-detailed cloth wrinkles poses significant challenges in computer graphics. The inherently non-uniform structure of cloth wrinkles mandates the employment of intricate discretization strategies, which are frequently characterized by high computational demands and complex methodologies. Addressing this, the research introduced in this paper elucidates a novel anisotropic cloth regression technique that capitalizes on the potential of implicit neural representations of surfaces. Our first core contribution is an innovative mesh-free sampling approach, crafted to reduce the reliance on traditional mesh structures, thereby offering greater flexibility and accuracy in capturing fine cloth details. Our second contribution is a novel adversarial training scheme, which is designed meticulously to strike a harmonious balance between the sampling and simulation objectives. The adversarial approach ensures that the wrinkles are represented with high fidelity, while also maintaining computational efficiency. Our results showcase through various cloth-object interaction scenarios that our method, given the same memory constraints, consistently surpasses traditional discrete representations, particularly when modelling highly-detailed localized wrinkles.

6.0CVJan 20

FastGHA: Generalized Few-Shot 3D Gaussian Head Avatars with Real-Time Animation

Xinya Ji, Sebastian Weiss, Manuel Kansy et al.

Despite recent progress in 3D Gaussian-based head avatar modeling, efficiently generating high fidelity avatars remains a challenge. Current methods typically rely on extensive multi-view capture setups or monocular videos with per-identity optimization during inference, limiting their scalability and ease of use on unseen subjects. To overcome these efficiency drawbacks, we propose \OURS, a feed-forward method to generate high-quality Gaussian head avatars from only a few input images while supporting real-time animation. Our approach directly learns a per-pixel Gaussian representation from the input images, and aggregates multi-view information using a transformer-based encoder that fuses image features from both DINOv3 and Stable Diffusion VAE. For real-time animation, we extend the explicit Gaussian representations with per-Gaussian features and introduce a lightweight MLP-based dynamic network to predict 3D Gaussian deformations from expression codes. Furthermore, to enhance geometric smoothness of the 3D head, we employ point maps from a pre-trained large reconstruction model as geometry supervision. Experiments show that our approach significantly outperforms existing methods in both rendering quality and inference efficiency, while supporting real-time dynamic avatar animation.

8.7CVFeb 29, 2024

Learning a Generalized Physical Face Model From Data

Lingchen Yang, Gaspard Zoss, Prashanth Chandran et al.

Physically-based simulation is a powerful approach for 3D facial animation as the resulting deformations are governed by physical constraints, allowing to easily resolve self-collisions, respond to external forces and perform realistic anatomy edits. Today's methods are data-driven, where the actuations for finite elements are inferred from captured skin geometry. Unfortunately, these approaches have not been widely adopted due to the complexity of initializing the material space and learning the deformation model for each character separately, which often requires a skilled artist followed by lengthy network training. In this work, we aim to make physics-based facial animation more accessible by proposing a generalized physical face model that we learn from a large 3D face dataset. Once trained, our model can be quickly fit to any unseen identity and produce a ready-to-animate physical face model automatically. Fitting is as easy as providing a single 3D face scan, or even a single face image. After fitting, we offer intuitive animation controls, as well as the ability to retarget animations across characters. All the while, the resulting animations allow for physical effects like collision avoidance, gravity, paralysis, bone reshaping and more.

5.2CVJan 27, 2024

An Implicit Physical Face Model Driven by Expression and Style

Lingchen Yang, Gaspard Zoss, Prashanth Chandran et al.

3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.

6.2CVJan 15, 2025

Joint Learning of Depth and Appearance for Portrait Image Animation

Xinya Ji, Gaspard Zoss, Prashanth Chandran et al.

2D portrait animation has experienced significant advancements in recent years. Much research has utilized the prior knowledge embedded in large generative diffusion models to enhance high-quality image manipulation. However, most methods only focus on generating RGB images as output, and the co-generation of consistent visual plus 3D output remains largely under-explored. In our work, we propose to jointly learn the visual appearance and depth simultaneously in a diffusion-based portrait image generator. Our method embraces the end-to-end diffusion paradigm and introduces a new architecture suitable for learning this conditional joint distribution, consisting of a reference network and a channel-expanded diffusion backbone. Once trained, our framework can be efficiently adapted to various downstream applications, such as facial depth-to-image and image-to-depth generation, portrait relighting, and audio-driven talking head animation with consistent 3D output.

12.8CVJan 26, 2024

Implicit Neural Representation for Physics-driven Actuated Soft Bodies

Lingchen Yang, Byungsoo Kim, Gaspard Zoss et al.

Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks. Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discretization agnostic and widely applicable. We extend our implicit model to mandible kinematics for the particular case of facial animation and show that we can reliably reproduce facial expressions captured with high-quality capture systems. We apply the method to volumetric soft bodies, human poses, and facial expressions, demonstrating artist-friendly properties, such as simple control over the latent space and resolution invariance at test time.

8.7CVApr 13, 2021Code

Global Transport for Fluid Reconstruction with Learned Self-Supervision

Aleksandra Franz, Barbara Solenthaler, Nils Thuerey

We propose a novel method to reconstruct volumetric flows from sparse views via a global transport formulation. Instead of obtaining the space-time function of the observations, we reconstruct its motion based on a single initial state. In addition we introduce a learned self-supervision that constrains observations from unseen angles. These visual constraints are coupled via the transport constraints and a differentiable rendering step to arrive at a robust end-to-end reconstruction algorithm. This makes the reconstruction of highly realistic flow motions possible, even from only a single input view. We show with a variety of synthetic and real flows that the proposed global reconstruction of the transport process yields an improved reconstruction of the fluid motion.

9.2GRMay 2, 2020Code

Lagrangian Neural Style Transfer for Fluids

Byungsoo Kim, Vinicius C. Azevedo, Markus Gross et al.

Artistically controlling the shape, motion and appearance of fluid simulations pose major challenges in visual effects production. In this paper, we present a neural style transfer approach from images to 3D fluids formulated in a Lagrangian viewpoint. Using particles for style transfer has unique benefits compared to grid-based techniques. Attributes are stored on the particles and hence are trivially transported by the particle motion. This intrinsically ensures temporal consistency of the optimized stylized structure and notably improves the resulting quality. Simultaneously, the expensive, recursive alignment of stylization velocity fields of grid approaches is unnecessary, reducing the computation time to less than an hour and rendering neural flow stylization practical in production settings. Moreover, the Lagrangian representation improves artistic control as it allows for multi-fluid stylization and consistent color transfer from images, and the generality of the method enables stylization of smoke and liquids likewise.

12.2GRMar 12, 2020

Latent Space Subdivision: Stable and Controllable Time Predictions for Fluid Flow

Steffen Wiewel, Byungsoo Kim, Vinicius C. Azevedo et al.

We propose an end-to-end trained neural networkarchitecture to robustly predict the complex dynamics of fluid flows with high temporal stability. We focus on single-phase smoke simulations in 2D and 3D based on the incompressible Navier-Stokes (NS) equations, which are relevant for a wide range of practical problems. To achieve stable predictions for long-term flow sequences, a convolutional neural network (CNN) is trained for spatial compression in combination with a temporal prediction network that consists of stacked Long Short-Term Memory (LSTM) layers. Our core contribution is a novel latent space subdivision (LSS) to separate the respective input quantities into individual parts of the encoded latent space domain. This allows to distinctively alter the encoded quantities without interfering with the remaining latent space values and hence maximizes external control. By selectively overwriting parts of the predicted latent space points, our proposed method is capable to robustly predict long-term sequences of complex physics problems. In addition, we highlight the benefits of a recurrent training on the latent space creation, which is performed by the spatial compression network.

10.3LGDec 18, 2019

Frequency-Aware Reconstruction of Fluid Simulations with Generative Networks

Simon Biland, Vinicius C. Azevedo, Byungsoo Kim et al.

Convolutional neural networks were recently employed to fully reconstruct fluid simulation data from a set of reduced parameters. However, since (de-)convolutions traditionally trained with supervised L1-loss functions do not discriminate between low and high frequencies in the data, the error is not minimized efficiently for higher bands. This directly correlates with the quality of the perceived results, since missing high frequency details are easily noticeable. In this paper, we analyze the reconstruction quality of generative networks and present a frequency-aware loss function that is able to focus on specific bands of the dataset during training time. We show that our approach improves reconstruction quality of fluid simulation data in mid-frequency bands, yielding perceptually better results while requiring comparable training time.

6.6GRMay 17, 2019

Transport-Based Neural Style Transfer for Smoke Simulations

Byungsoo Kim, Vinicius C. Azevedo, Markus Gross et al.

Artistically controlling fluids has always been a challenging task. Optimization techniques rely on approximating simulation states towards target velocity or density field configurations, which are often handcrafted by artists to indirectly control smoke dynamics. Patch synthesis techniques transfer image textures or simulation features to a target flow field. However, these are either limited to adding structural patterns or augmenting coarse flows with turbulent structures, and hence cannot capture the full spectrum of different styles and semantically complex structures. In this paper, we propose the first Transport-based Neural Style Transfer (TNST) algorithm for volumetric smoke data. Our method is able to transfer features from natural images to smoke simulations, enabling general content-aware manipulations ranging from simple patterns to intricate motifs. The proposed algorithm is physically inspired, since it computes the density transport from a source input smoke to a desired target configuration. Our transport-based approach allows direct control over the divergence of the stylization velocity field by optimizing incompressible and irrotational potentials that transport smoke towards stylization. Temporal consistency is ensured by transporting and aligning subsequent stylized velocities, and 3D reconstructions are computed by seamlessly merging stylizations from different camera viewpoints.

1.2CYJun 7, 2018

Ten Years of Research on Intelligent Educational Games for Learning Spelling and Mathematics

Barbara Solenthaler, Severin Klingler, Tanja Käser et al.

In this article, we present our findings from ten years of research on intelligent educational games. We discuss the architecture of our training environments for learning spelling and mathematics, and specifically focus on the representation of the content and the controller that enables personalized trainings. We first show the multi-modal representation that reroutes information through multiple perceptual cues and discuss the game structure. We then present the data-driven student model that is used for a personalized, adaptive presentation of the content. We further leverage machine learning for analytics and visualization tools targeted at teachers and experts. A large data set consisting of training sessions of more than 20,000 children allows statistical interpretations and insights into the nature of learning.