LG · Sep 8, 2022 · Code
NeuralFMU: Presenting a workflow for integrating hybrid NeuralODEs into real world applications
Tobias Thummerer, Johannes Stoljar, Lars Mikelsons
The term NeuralODE describes the structural combination of an Artificial Neural Network (ANN) and a numerical solver for Ordinary Differential Equations (ODEs), in which the former acts as the right-hand side of the ODE to be solved. This concept was further extended by a black-box model in the form of a Functional Mock-up Unit (FMU) to obtain a subclass of NeuralODEs, named NeuralFMUs. The resulting structure features the advantages of first-principle and data-driven modeling approaches in one single simulation model: a higher prediction accuracy compared to conventional First Principle Models (FPMs), and a lower training effort compared to purely data-driven models. We present an intuitive workflow to set up and use NeuralFMUs, enabling the encapsulation and reuse of existing conventional models exported from common modeling tools. Moreover, we exemplify this concept by deploying a NeuralFMU for a consumption simulation based on a Vehicle Longitudinal Dynamics Model (VLDM), which is a typical use case in the automotive industry. Related challenges that are often neglected in scientific use cases, like real measurements (e.g. noise), an unknown system state or high-frequency discontinuities, are handled in this contribution. With the aim of building a hybrid model with a higher prediction quality than the original FPM, we briefly highlight two open-source libraries: FMI.jl for integrating FMUs into the Julia programming environment, and an extension to this library called FMIFlux.jl, which allows for the integration of FMUs into a neural network topology to finally obtain a NeuralFMU.
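The NeuralODE structure described in this abstract (a network acting as the right-hand side of an ODE, integrated by a numerical solver) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the FMI.jl or FMIFlux.jl API: the helper names `make_mlp`, `mlp` and `solve_euler`, the network size, the untrained random weights and the choice of an explicit Euler solver are all assumptions made for the example, and no FMU is involved.

```python
import math
import random


def make_mlp(n_in, n_hidden, n_out, seed=0):
    """Create a tiny one-hidden-layer network with random, untrained weights (hypothetical helper)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def mlp(params, x):
    """Evaluate the network: tanh hidden layer followed by a linear output layer."""
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]


def solve_euler(rhs, x0, t0, t1, steps):
    """Explicit Euler integration: x_{k+1} = x_k + dt * rhs(t_k, x_k)."""
    dt = (t1 - t0) / steps
    t, x = t0, list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        dx = rhs(t, x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        t += dt
        trajectory.append(list(x))
    return trajectory


# The network plays the role of the ODE right-hand side dx/dt = f(x).
params = make_mlp(n_in=2, n_hidden=8, n_out=2)
neural_rhs = lambda t, x: mlp(params, x)
trajectory = solve_euler(neural_rhs, [1.0, 0.0], 0.0, 1.0, 100)
```

In a real setup the weights would be trained by differentiating through the solver, and an adaptive solver would typically replace the fixed-step Euler scheme used here for brevity.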
36.7 · NE · May 27
Performance and Explainability Requirements of Evolutionary Algorithms in Real-World Physics-Informed Optimization
Helena Stegherr, Michael Heider, Nils Meyer et al.
Evolutionary computation offers a variety of tools to solve complex real-world optimization problems. However, research often focuses on smaller, simplified problems and optimization algorithms that sometimes miss expectations in real-world scenarios. Additionally, trust in the applied algorithm and the solutions it provides is often essential in such settings, but requires an understanding of the search process itself. This leads to evolutionary computation often not being seriously considered by practitioners in many application contexts, among them physics-based modeling. In this article, techniques from evolutionary computation are detailed that can alleviate these problems. First, five real-world physics-based optimization problems are introduced and described by domain experts. For each of these, the requirements for the evolutionary algorithm regarding performance and explainability to increase trust and usability are presented. We found that all domain experts expect fast convergence to a good solution and want some explanations for how the results were formed, while other requirements strongly depend on the respective problem. Finally, we present existing approaches that can be leveraged to improve those aspects of evolutionary algorithms but have to our knowledge never been employed in complex real-world scenarios. This implies a gap between both domains that needs to be closed to exploit the full potential of evolutionary computation.
LG · Feb 7, 2023
Eigen-informed NeuralODEs: Dealing with stability and convergence issues of NeuralODEs
Tobias Thummerer, Lars Mikelsons
Using vanilla NeuralODEs to model large and/or complex systems often fails for two reasons: stability and convergence. NeuralODEs are capable of describing stable as well as unstable dynamic systems. Selecting an appropriate numerical solver is not trivial, because NeuralODE properties change during training. If the NeuralODE becomes stiffer, a suboptimal solver may need to perform very small solver steps, which significantly slows down the training process. If the NeuralODE becomes too unstable, the numerical solver might not be able to solve it at all, which causes the training process to terminate. Often, this is tackled by choosing a computationally expensive solver that is robust to unstable and stiff ODEs, but at the cost of a significantly decreased training performance. Our method, on the other hand, allows enforcing ODE properties that fit a specific solver or application-related boundary conditions. Concerning the convergence behavior, NeuralODEs often tend to run into local minima, especially if the system to be learned is highly dynamic and/or oscillating over multiple periods. Because of the vanishing gradient at a local minimum, the NeuralODE is often not capable of leaving it and converging to the right solution. We present a technique to add knowledge of ODE properties based on eigenvalues - like (partial) stability, oscillation capability, frequency, damping and/or stiffness - to the training objective of a NeuralODE. We exemplify our method on a linear as well as a nonlinear system model and show that the presented training process is far more robust against local minima, instabilities and sparse data samples and improves training convergence and performance.
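To make the idea of adding eigenvalue knowledge to a training objective more concrete, here is a minimal sketch for a linear 2x2 system dx/dt = A x. The abstract does not state the exact loss formulation, so the penalty below (squared positive real parts of the eigenvalues), the function names and the `weight` and `margin` parameters are assumptions chosen for illustration, not the authors' method.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant (closed form)."""
    (a11, a12), (a21, a22) = a
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0


def stability_penalty(a, margin=0.0):
    """Assumed penalty: zero if all eigenvalues satisfy Re(lambda) <= -margin, positive otherwise."""
    return sum(max(0.0, lam.real + margin) ** 2 for lam in eigenvalues_2x2(a))


def total_loss(data_loss, a, weight=1.0):
    """Data-fitting loss plus the weighted eigenvalue-based term."""
    return data_loss + weight * stability_penalty(a)


# System matrices of a damped and a negatively damped (unstable) oscillator.
damped = [[0.0, 1.0], [-1.0, -0.2]]
unstable = [[0.0, 1.0], [-1.0, 0.2]]

penalty_damped = stability_penalty(damped)      # eigenvalues have real part -0.1
penalty_unstable = stability_penalty(unstable)  # eigenvalues have real part +0.1
```

For a nonlinear NeuralODE, the matrix would be the Jacobian of the right-hand side evaluated along the trajectory, and the eigenvalue computation would have to be differentiable with respect to the network parameters.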
CV · Jul 15, 2024
InsertDiffusion: Identity Preserving Visualization of Objects through a Training-Free Diffusion Architecture
Phillip Mueller, Jannik Wiese, Ioan Craciun et al.
Recent advancements in image synthesis are fueled by the advent of large-scale diffusion models. Yet, integrating realistic object visualizations seamlessly into new or existing backgrounds without extensive training remains a challenge. This paper introduces InsertDiffusion, a novel, training-free diffusion architecture that efficiently embeds objects into images while preserving their structural and identity characteristics. Our approach utilizes off-the-shelf generative models and eliminates the need for fine-tuning, making it ideal for rapid and adaptable visualizations in product design and marketing. We demonstrate superior performance over existing methods in terms of image realism and alignment with input conditions. By decomposing the generation task into independent steps, InsertDiffusion offers a scalable solution that extends the capabilities of diffusion models for practical applications, achieving high-quality visualizations that maintain the authenticity of the original objects.
LG · Jul 15, 2024
Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception
Phillip Mueller, Lars Mikelsons
The synthesis of product design concepts stands at the crux of early-phase development processes for technical products, traditionally posing an intricate interdisciplinary challenge. The application of deep learning methods, particularly Deep Generative Models (DGMs), holds the promise of automating and streamlining manual iterations and therefore introducing heightened levels of innovation and efficiency. However, DGMs have yet to be widely adopted into the synthesis of product design concepts. This paper aims to explore the reasons behind this limited application and derive the requirements for successful integration of these technologies. We systematically analyze DGM-families (VAE, GAN, Diffusion, Transformer, Radiance Field), assessing their strengths, weaknesses, and general applicability for product design conception. Our objective is to provide insights that simplify the decision-making process for engineers, helping them determine which method might be most effective for their specific challenges. Recognizing the rapid evolution of this field, we hope that our analysis contributes to a fundamental understanding and guides practitioners towards the most promising approaches. This work seeks not only to illuminate current challenges but also to propose potential solutions, thereby offering a clear roadmap for leveraging DGMs in the realm of product design conception.
LG · Sep 9, 2021 · Code
NeuralFMU: Towards Structural Integration of FMUs into Neural Networks
Tobias Thummerer, Josef Kircher, Lars Mikelsons
This paper covers two major subjects: First, a new open-source library called FMI.jl is presented, which integrates FMI into the Julia programming environment by providing the possibility to load, parameterize and simulate FMUs. Further, an extension to this library called FMIFlux.jl is introduced, which allows the integration of FMUs into a neural network topology to obtain a NeuralFMU. This structural combination of an industry-typical black-box model and a data-driven machine learning model combines the different advantages of both modeling approaches in one single development environment. This allows for the usage of advanced data-driven modeling techniques for physical effects that are difficult to model based on first principles.
CV · Sep 25, 2024
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design
Phillip Mueller, Sebastian Mueller, Lars Mikelsons
We provide a dataset for enabling Deep Generative Models (DGMs) in engineering design and propose methods to automate data labeling by utilizing large-scale foundation models. GeoBiked is curated to contain 4 355 bicycle images, annotated with structural and technical features, and is used to investigate two automated labeling techniques: the utilization of consolidated latent features (Hyperfeatures) from image-generation models to detect geometric correspondences (e.g. the position of the wheel center) in structural images, and the generation of diverse text descriptions for structural images. GPT-4o, a vision-language model (VLM), is instructed to analyze images and produce diverse descriptions aligned with the system prompt. By representing technical images as Diffusion-Hyperfeatures, drawing geometric correspondences between them is possible. The detection accuracy of geometric points in unseen samples is improved by presenting multiple annotated source images. GPT-4o has sufficient capabilities to generate accurate descriptions of technical images. Grounding the generation only on images leads to diverse descriptions but causes hallucinations, while grounding it on categorical labels restricts the diversity. Using both as input balances creativity and accuracy. Successfully using Hyperfeatures for geometric correspondence suggests that this approach can be used for general point-detection and annotation tasks in technical images. Labeling such images with text descriptions using VLMs is possible, but dependent on the model's detection capabilities, careful prompt engineering and the selection of input information. Applying foundation models in engineering design is largely unexplored. We aim to bridge this gap with a dataset to explore training, finetuning and conditioning DGMs in this field and suggest approaches to bootstrap foundation models to process technical images.
2.5 · HC · Apr 29
Persona-Based Process Design for Assistive Human-Robot Workplaces for Persons with Disabilities
Nils Mandischer, Daria Eckert, Lars Mikelsons
Human-robot interaction is emerging as an important paradigm for integrating persons with disabilities into the workplace. While these systems can enable individuals to work, their design is mostly personalized, hindering widespread use beyond the individual user. The universal design paradigm is a central pillar of inclusive design, describing usability of systems by all. To incorporate universal design into process design for human-robot workplaces, expert knowledge is required that is often not available. To simplify process design of human-robot workplaces, we propose a persona-based design approach. First, typical impairments prevalent in the workforce or particularly relevant for the processes are abstracted into personas with disabilities. The work process is subdivided into sequential actions. For each action and persona, strategies are developed to reach the action goal by a design thinking approach. The resulting actions are ordered by level of robot assistance, i.e. robot involvement, and implemented in a behavior tree. As a result, the macro-behavior of the workplace may adapt to individual personas online. We demonstrate the method in a collaborative box folding process with a total of seven personas with disabilities. The persona-based process design shows promising results by generating more comprehensive process strategies while enabling adaptive behavior in the sense of universal design.
RO · May 16, 2024
Towards Consistent and Explainable Motion Prediction using Heterogeneous Graph Attention
Tobias Demmler, Andreas Tamke, Thao Dang et al.
In autonomous driving, accurately interpreting the movements of other road users and leveraging this knowledge to forecast future trajectories is crucial. This is typically achieved through the integration of map data and tracked trajectories of various agents. Numerous methodologies combine this information into a singular embedding for each agent, which is then utilized to predict future behavior. However, these approaches have a notable drawback in that they may lose exact location information during the encoding process. The encoding still includes general map information. However, the generation of valid and consistent trajectories is not guaranteed. This can cause the predicted trajectories to stray from the actual lanes. This paper introduces a new refinement module designed to project the predicted trajectories back onto the actual map, rectifying these discrepancies and leading towards more consistent predictions. This versatile module can be readily incorporated into a wide range of architectures. Additionally, we propose a novel scene encoder that handles all relations between agents and their environment in a single unified heterogeneous graph attention network. By analyzing the attention values on the different edges in this graph, we can gain unique insights into the neural network's inner workings leading towards a more explainable prediction.
LG · Oct 14, 2024
Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations
Julius Aka, Johannes Brunnemann, Jörg Eiden et al.
Variational Autoencoders (VAEs) are a powerful framework for learning latent representations of reduced dimensionality, while Neural ODEs excel in learning transient system dynamics. This work combines the strengths of both to generate fast surrogate models with adjustable complexity reacting to time-varying input signals. By leveraging the VAE's dimensionality reduction using a nonhierarchical prior, our method adaptively assigns stochastic noise, naturally complementing known NeuralODE training enhancements and enabling probabilistic time series modeling. We show that standard Latent ODEs struggle with dimensionality reduction in systems with time-varying inputs. Our approach mitigates this by continuously propagating variational parameters through time, establishing fixed information channels in latent space. This results in a flexible and robust method that can learn different system complexities, e.g. deep neural networks or linear matrices. It thereby enables efficient approximation of the Koopman operator without the need for predefining its dimensionality. As our method balances dimensionality reduction and reconstruction accuracy, we call it Balanced Neural ODE (B-NODE). We demonstrate the effectiveness of this method on several academic and real-world test cases, e.g. a power plant or MuJoCo data.
RO · Apr 22, 2025
Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction
Tobias Demmler, Lennart Hartung, Andreas Tamke et al.
In autonomous driving, accurately predicting the movements of other traffic participants is crucial, as it significantly influences a vehicle's planning processes. Modern trajectory prediction models strive to interpret complex patterns and dependencies from agent and map data. The Motion Transformer (MTR) architecture and subsequent work define the most accurate methods in common benchmarks such as the Waymo Open Motion Benchmark. The MTR model employs pre-generated static intention points as initial goal points for trajectory prediction. However, the static nature of these points frequently leads to misalignment with map data in specific traffic scenarios, resulting in unfeasible or unrealistic goal points. Our research addresses this limitation by integrating scene-specific dynamic intention points into the MTR model. This adaptation of the MTR model was trained and evaluated on the Waymo Open Motion Dataset. Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory prediction accuracy, especially for predictions over long time horizons. Furthermore, we analyze the impact on ground truth trajectories which are not compliant with the map data or are illegal maneuvers.
CV · Oct 25, 2025
GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation
Phillip Mueller, Talip Uenlue, Sebastian Schmidt et al.
Precise geometric control in image generation is essential for engineering & product design and creative industries to control 3D object features accurately in image space. Traditional 3D editing approaches are time-consuming and demand specialized skills, while current image-based generative methods lack accuracy in geometric conditioning. To address these challenges, we propose GeoDiffusion, a training-free framework for accurate and efficient geometric conditioning of 3D features in image generation. GeoDiffusion employs a class-specific 3D object as a geometric prior to define keypoints and parametric correlations in 3D space. We ensure viewpoint consistency through a rendered image of a reference 3D object, followed by style transfer to meet user-defined appearance specifications. At the core of our framework is GeoDrag, improving accuracy and speed of drag-based image editing on geometry guidance tasks and general instructions on DragBench. Our results demonstrate that GeoDiffusion enables precise geometric modifications across various iterative design workflows.
RO · Jul 7, 2025
Beyond Features: How Dataset Design Influences Multi-Agent Trajectory Prediction Performance
Tobias Demmler, Jakob Häringer, Andreas Tamke et al.
Accurate trajectory prediction is critical for safe autonomous navigation, yet the impact of dataset design on model performance remains understudied. This work systematically examines how feature selection, cross-dataset transfer, and geographic diversity influence trajectory prediction accuracy in multi-agent settings. We evaluate a state-of-the-art model using our novel L4 Motion Forecasting dataset based on our own data recordings in Germany and the US. This includes enhanced map and agent features. We compare our dataset to the US-centric Argoverse 2 benchmark. First, we find that incorporating supplementary map and agent features unique to our dataset yields no measurable improvement over baseline features, demonstrating that modern architectures do not need extensive feature sets for optimal performance. The limited features of public datasets are sufficient to capture convoluted interactions without added complexity. Second, we perform cross-dataset experiments to evaluate how effectively domain knowledge can be transferred between datasets. Third, we group our dataset by country and check the knowledge transfer between different driving cultures.
LG · May 22, 2025
Masked Conditioning for Deep Generative Models
Phillip Mueller, Jannik Wiese, Sebastian Mueller et al.
Datasets in engineering domains are often small, sparsely labeled, and contain numerical as well as categorical conditions. Additionally, computational resources are typically limited in practical applications, which hinders the adoption of generative models for engineering tasks. We introduce a novel masked-conditioning approach that enables generative models to work with sparse, mixed-type data. We mask conditions during training to simulate sparse conditions at inference time. For this purpose, we explore the use of various sparsity schedules that show different strengths and weaknesses. In addition, we introduce a flexible embedding that deals with categorical as well as numerical conditions. We integrate our method into an efficient variational autoencoder as well as a latent diffusion model and demonstrate the applicability of our approach on two engineering-related datasets of 2D point clouds and images. Finally, we show that small models trained on limited data can be coupled with large pretrained foundation models to improve generation quality while retaining the controllability induced by our conditioning scheme.
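The masking step mentioned in this abstract (hiding conditions during training so the model copes with sparse conditions at inference time) might look roughly like the sketch below. The abstract gives no implementation details, so everything here is an assumption for illustration: the function names, the zero fill value for hidden entries, the appended visibility mask and the linear sparsity schedule. Only numerical conditions are handled; the paper's embedding for categorical conditions is not reproduced.

```python
import random


def mask_conditions(conditions, mask_prob, rng):
    """Hide each condition independently with probability mask_prob.

    Returns (values, mask): hidden entries are set to 0.0 and flagged with 0.0 in the mask,
    visible entries keep their value and are flagged with 1.0.
    """
    values, mask = [], []
    for c in conditions:
        keep = rng.random() >= mask_prob
        values.append(c if keep else 0.0)
        mask.append(1.0 if keep else 0.0)
    return values, mask


def linear_sparsity_schedule(step, total_steps, p_start=0.0, p_end=0.8):
    """Assumed schedule: masking probability grows linearly from p_start to p_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return p_start + frac * (p_end - p_start)


rng = random.Random(0)
conditions = [0.7, 1.2, 3.0, 0.0]  # hypothetical numerical conditions of one sample

for step in (0, 50, 100):
    p = linear_sparsity_schedule(step, total_steps=100)
    values, mask = mask_conditions(conditions, p, rng)
    # The mask is concatenated so the model can tell "hidden" apart from a true 0.0.
    model_input = values + mask
```

Passing the mask alongside the values is one common way to disambiguate missing conditions from conditions whose value happens to be zero; other fill strategies, such as a learned placeholder embedding, are equally plausible.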
CV · Mar 18, 2025
MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling
Damian Boborzi, Phillip Mueller, Jonas Emrich et al.
Generative models have recently made remarkable progress in the field of 3D objects. However, their practical application in fields like engineering remains limited since they fail to deliver the accuracy, quality, and controllability needed for domain-specific tasks. Fine-tuning large generative models is a promising perspective for making these models available in these fields. Creating high-quality, domain-specific 3D datasets is crucial for fine-tuning large generative models, yet the data filtering and annotation process remains a significant bottleneck. We present MeshFleet, a filtered and annotated 3D vehicle dataset extracted from Objaverse-XL, the most extensive publicly available collection of 3D objects. Our approach proposes a pipeline for automated data filtering based on a quality classifier. This classifier is trained on a manually labeled subset of Objaverse, incorporating DINOv2 and SigLIP embeddings, refined through caption-based analysis and uncertainty estimation. We demonstrate the efficacy of our filtering method through a comparative analysis against caption and image aesthetic score-based techniques and fine-tuning experiments with SV3D, highlighting the importance of targeted data selection for domain-specific 3D generative modeling.
LG · Jun 12, 2024
Learnable & Interpretable Model Combination in Dynamical Systems Modeling
Tobias Thummerer, Lars Mikelsons
During modeling of dynamical systems, often two or more model architectures are combined to obtain a more powerful or efficient model regarding a specific application area. This covers the combination of multiple machine learning architectures, as well as hybrid models, i.e., the combination of physical simulation models and machine learning. In this work, we briefly discuss which types of model are usually combined in dynamical systems modeling and propose a class of models that is capable of expressing mixed algebraic, discrete, and differential equation-based models. Further, we examine different established, as well as new ways of combining these models from the point of view of system theory and highlight two challenges - algebraic loops and local event functions in discontinuous models - that require a special approach. Finally, we propose a new wildcard architecture that is capable of describing arbitrary combinations of models in an easy-to-interpret fashion that can be learned as part of a gradient-based optimization procedure. In a final experiment, different combination architectures between two models are learned, interpreted, and compared using the methodology and software implementation provided.
LG · Feb 9, 2022
Imitation Learning by State-Only Distribution Matching
Damian Boborzi, Christoph-Nikolas Straehle, Jens S. Buchner et al.
Imitation Learning from observation describes policy learning in a similar way to human learning. An agent's policy is trained by observing an expert performing a task. While many state-only imitation learning approaches are based on adversarial imitation learning, one main drawback is that adversarial training is often unstable and lacks a reliable convergence estimator. If the true environment reward is unknown and cannot be used to select the best-performing model, this can result in bad real-world policy performance. We propose a non-adversarial learning-from-observations approach, together with an interpretable convergence and performance metric. Our training objective minimizes the Kullback-Leibler divergence (KLD) between the policy and expert state transition trajectories, which can be optimized in a non-adversarial fashion. Such methods demonstrate improved robustness when learned density models guide the optimization. We further improve the sample efficiency by rewriting the KLD minimization as the Soft Actor Critic objective based on a modified reward using additional density models that estimate the environment's forward and backward dynamics. Finally, we evaluate the effectiveness of our approach on well-known continuous control environments and show state-of-the-art performance while having a reliable performance estimator compared to several recent learning-from-observation methods.
LG · Sep 10, 2021
Hybrid modeling of the human cardiovascular system using NeuralFMUs
Tobias Thummerer, Johannes Tintenherr, Lars Mikelsons
Hybrid modeling, the combination of first-principle and machine learning models, is an emerging research field that is attracting more and more attention. Even if hybrid models produce formidable results for academic examples, there are still different technical challenges that hinder the use of hybrid modeling in real-world applications. By presenting NeuralFMUs, the fusion of an FMU, a numerical ODE solver and an ANN, we are paving the way for the use of a variety of first-principle models from different modeling tools as parts of hybrid models. This contribution handles the hybrid modeling of a complex, real-world example: Starting with a simplified 1D-fluid model of the human cardiovascular system (arterial side), the aim is to learn neglected physical effects like arterial elasticity from data. We will show that the hybrid modeling process is more comfortable, needs less system knowledge and is therefore less error-prone compared to modeling solely based on first principles. Further, the resulting hybrid model has improved computational performance compared to a pure first-principle white-box model, while still fulfilling the requirements regarding accuracy of the considered hemodynamic quantities. The use of the presented techniques is explained in a general manner and the considered use case can serve as an example for other modeling and simulation applications in and beyond the medical domain.