Bertram Taetz

CV
h-index: 18
11 papers
396 citations
Novelty: 56%
AI Score: 47

11 Papers

LG Apr 1, 2022
Autoencoder Attractors for Uncertainty Estimation

Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter et al.

The reliability assessment of a machine learning model's prediction is important for deployment in safety-critical applications. Not only can it be used to detect novel sceneries, either as out-of-distribution or anomalous samples, but it also helps to determine deficiencies in the training data distribution. Many promising research directions either propose traditional methods such as Gaussian processes or extend deep learning based approaches, for example, by interpreting them from a Bayesian point of view. In this work we propose a novel approach for uncertainty estimation based on autoencoder models: the recursive application of a previously trained autoencoder model can be interpreted as a dynamical system storing training examples as attractors. While input images close to known samples will converge to the same or a similar attractor, input samples containing unknown features are unstable and converge to different training samples by potentially removing or changing characteristic features. The use of dropout during training and inference leads to a family of similar dynamical systems, each one being robust on samples close to the training distribution but unstable on new features. Either the model reliably removes these features or the resulting instability can be exploited to detect problematic input samples. We evaluate our approach on several dataset combinations as well as on an industrial application for occupant classification in the vehicle interior, for which we additionally release a new synthetic dataset.
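
The idea of reading instability under repeated, dropout-perturbed application as an uncertainty signal can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_autoencoder` is a hypothetical stand-in that pulls its input towards the nearest stored pattern (it is not a trained network and not the authors' model), and `attractor_spread` is a hypothetical score that measures how much the end points of several stochastic runs disagree.

```python
import numpy as np

def toy_autoencoder(x, patterns, rng, drop_p=0.2, step=0.5):
    """Hypothetical stand-in for one pass through a trained autoencoder with dropout.

    A random mask hides some input features; the input is then pulled
    towards the stored pattern that is nearest on the visible features.
    """
    keep = rng.random(x.shape) >= drop_p              # True = feature is kept
    dist = (((patterns - x) ** 2) * keep).sum(axis=1)  # masked squared distances
    nearest = patterns[np.argmin(dist)]
    return x + step * (nearest - x)

def attractor_spread(x0, patterns, n_runs=50, n_iter=15, seed=0):
    """Iterate the stochastic map several times and return the total
    variance of the end points across runs (assumed instability score)."""
    rng = np.random.default_rng(seed)
    ends = []
    for _ in range(n_runs):
        x = x0.copy()
        for _ in range(n_iter):
            x = toy_autoencoder(x, patterns, rng)
        ends.append(x)
    return float(np.stack(ends).var(axis=0).sum())
```

In this toy setting, an input close to one stored pattern ends at the same point in every run, whereas an input that mixes features of two patterns ends at different patterns depending on the dropout masks, so its spread is much larger.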

NA Feb 9, 2011
An Unstaggered Constrained Transport Method for the 3D Ideal Magnetohydrodynamic Equations

Christiane Helzel, James A. Rossmanith, Bertram Taetz

Numerical methods for solving the ideal magnetohydrodynamic (MHD) equations in more than one space dimension must either confront the challenge of controlling errors in the discrete divergence of the magnetic field, or else be faced with nonlinear numerical instabilities. One approach for controlling the discrete divergence is through a so-called constrained transport method, which is based on first predicting a magnetic field through a standard finite volume solver, and then correcting this field through the appropriate use of a magnetic vector potential. In this work we develop a constrained transport method for the 3D ideal MHD equations that is based on a high-resolution wave propagation scheme. Our proposed scheme is the 3D extension of the 2D scheme developed by Rossmanith [SIAM J. Sci. Comp. 28, 1766 (2006)], and is based on the high-resolution wave propagation method of Langseth and LeVeque [J. Comp. Phys. 165, 126 (2000)]. In particular, in our extension we take great care to maintain the three most important properties of the 2D scheme: (1) all quantities, including all components of the magnetic field and magnetic potential, are treated as cell-centered; (2) we develop a high-resolution wave propagation scheme for evolving the magnetic potential; and (3) we develop a wave limiting approach that is applied during the vector potential evolution, which controls unphysical oscillations in the magnetic field. One of the key numerical difficulties that is novel to 3D is that the transport equation that must be solved for the magnetic vector potential is only weakly hyperbolic. In presenting our numerical algorithm we describe how to numerically handle this problem of weak hyperbolicity, as well as how to choose an appropriate gauge condition. The resulting scheme is applied to several numerical test cases.
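
The reason a vector potential helps with the divergence constraint can be checked numerically. The sketch below is not the scheme from the paper: it assumes a periodic, uniform, cell-centered grid and plain second-order central differences, and the function names are chosen here for illustration. Under those assumptions the discrete divergence of the discrete curl vanishes to round-off, because the difference operators along different axes commute.

```python
import numpy as np

def central_diff(f, axis, h):
    """Second-order central difference on a periodic, uniform grid."""
    return (np.roll(f, -1, axis=axis) - np.roll(f, 1, axis=axis)) / (2.0 * h)

def curl(A, h):
    """Discrete curl of a cell-centered vector field A with shape (3, n, n, n)."""
    Ax, Ay, Az = A
    Bx = central_diff(Az, 1, h) - central_diff(Ay, 2, h)
    By = central_diff(Ax, 2, h) - central_diff(Az, 0, h)
    Bz = central_diff(Ay, 0, h) - central_diff(Ax, 1, h)
    return np.stack([Bx, By, Bz])

def divergence(B, h):
    """Discrete divergence using the same central differences as curl()."""
    return sum(central_diff(B[i], i, h) for i in range(3))
```

Setting `B = curl(A, h)` for any array `A` of shape `(3, n, n, n)` gives a field whose `divergence(B, h)` is zero up to floating-point error, which is the property the correction step of a constrained transport method relies on.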

CV Apr 1, 2022
Autoencoder for Synthetic to Real Generalization: From Simple to More Complex Scenes

Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter et al.

Learning on synthetic data and transferring the resulting properties to their real counterparts is an important challenge for reducing costs and increasing safety in machine learning. In this work, we focus on autoencoder architectures and aim at learning latent space representations that are invariant to inductive biases caused by the domain shift between simulated and real images showing the same scenario. We train on synthetic images only and present approaches that increase generalizability and improve the preservation of semantics on real datasets of increasing visual complexity. We show that pre-trained feature extractors (e.g. VGG) can be sufficient for generalization on images of lower complexity, but additional improvements are required for visually more complex scenes. To this end, we demonstrate that a new sampling technique, which matches semantically important parts of the image while randomizing the other parts, leads to salient feature extraction and to unimportant parts being neglected. This helps the generalization to real data, and we further show that our approach outperforms fine-tuned classification models.

NA Oct 15, 2012
A high-order unstaggered constrained transport method for the 3D ideal magnetohydrodynamic equations based on the method of lines

Christiane Helzel, James A. Rossmanith, Bertram Taetz

Numerical methods for solving the ideal magnetohydrodynamic (MHD) equations in more than one space dimension must confront the challenge of controlling errors in the discrete divergence of the magnetic field. One class of approaches that has proven successful in stabilizing MHD calculations is constrained transport (CT) schemes. CT schemes can be viewed as predictor-corrector methods for updating the magnetic field, where a magnetic field value is first predicted by a method that does not exactly preserve the divergence-free condition on the magnetic field, followed by a correction step that aims to control these divergence errors. In Helzel et al. (2011) the authors presented an unstaggered constrained transport method for the MHD equations on 3D Cartesian grids. In this work we generalize the method of Helzel et al. (2011) in three important ways: (1) we remove the need for operator splitting by switching to an appropriate method of lines discretization and coupling this with a non-conservative finite volume method for the magnetic vector potential equation, (2) we increase the spatial and temporal order of accuracy of the entire method to third order, and (3) we develop the method so that it is applicable on both Cartesian and logically rectangular mapped grids. The evolution equation for the magnetic vector potential is solved using a non-conservative finite volume method. The curl of the magnetic potential is computed via a third-order accurate discrete operator that is derived from appropriate application of the divergence theorem and subsequent numerical quadrature on element faces. Special artificial resistivity limiters are used to control unphysical oscillations in the magnetic potential and field components across shocks. Test computations are shown that confirm third-order accuracy for smooth test problems and high resolution for test problems with shock waves.

17.7 CV Apr 17
Amortized Inverse Kinematics via Graph Attention for Real-Time Human Avatar Animation

Muhammad Saif Ullah Khan, Chen-Yu Wang, Tim Prokosch et al.

Inverse kinematics (IK) is a core operation in animation, robotics, and biomechanics: given Cartesian constraints, recover joint rotations under a known kinematic tree. In many real-time human avatar pipelines, the available signal per frame is a sparse set of tracked 3D joint positions, whereas animation systems require joint orientations to drive skinning. Recovering full orientations from positions is underconstrained, most notably because twist about bone axes is ambiguous, and classical IK solvers typically rely on iterative optimization that can be slow and sensitive to noisy inputs. We introduce IK-GAT, a lightweight graph-attention network that reconstructs full-body joint orientations from 3D joint positions in a single forward pass. The model performs message passing over the skeletal parent-child graph to exploit kinematic structure during rotation inference. To simplify learning, IK-GAT predicts rotations in a bone-aligned world-frame representation anchored to rest-pose bone frames. This parameterization makes the twist axis explicit and is exactly invertible to standard parent-relative local rotations given the kinematic tree and rest pose. The network uses a continuous 6D rotation representation and is trained with a geodesic loss on SO(3) together with an optional forward-kinematics consistency regularizer. IK-GAT produces animation-ready local rotations that can directly drive a rigged avatar or be converted to pose parameters of SMPL-like body models for real-time and online applications. With 374K parameters and over 650 FPS on CPU, IK-GAT outperforms VPoser-based per-frame iterative optimization without warm-start at significantly lower cost, and is robust to initial pose and input noise.
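
The two rotation-related ingredients named in the abstract, a continuous 6D representation and a geodesic distance on SO(3), can be sketched as follows. This shows the commonly used construction (Gram-Schmidt on two 3-vectors, and the rotation angle of the relative rotation). Whether IK-GAT uses exactly this column convention or loss reduction is an assumption, and the function names are chosen here for illustration.

```python
import numpy as np

def rotation_from_6d(x):
    """Map a 6-vector to a rotation matrix by Gram-Schmidt orthonormalisation.

    The first three entries give the first column (after normalisation),
    the last three give the second column (after removing the component
    along the first); the third column is their cross product.
    """
    a1, a2 = x[:3], x[3:]
    b1 = a1 / np.linalg.norm(a1)
    a2 = a2 - np.dot(b1, a2) * b1
    b2 = a2 / np.linalg.norm(a2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=1)

def geodesic_angle(R1, R2):
    """Geodesic distance on SO(3): rotation angle (radians) of R1^T R2."""
    cos = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))
```

For example, `rotation_from_6d(np.array([1.0, 0, 0, 0, 1.0, 0]))` yields the identity matrix, and the geodesic angle between the identity and a quarter turn about the z-axis is pi/2.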

CV Oct 7, 2025
Continual Learning for Image Captioning through Improved Image-Text Alignment

Bertram Taetz, Gal Bordelius

Generating accurate and coherent image captions in a continual learning setting remains a major challenge due to catastrophic forgetting and the difficulty of aligning evolving visual concepts with language over time. In this work, we propose a novel multi-loss framework for continual image captioning that integrates semantic guidance through prompt-based continual learning and contrastive alignment. Built upon a pretrained ViT-GPT-2 backbone, our approach combines standard cross-entropy loss with three additional components: (1) a prompt-based cosine similarity loss that aligns image embeddings with synthetically constructed prompts encoding objects, attributes, and actions; (2) a CLIP-style loss that promotes alignment between image embeddings and target caption embeddings; and (3) a language-guided contrastive loss that employs a triplet loss to enhance class-level discriminability between tasks. Notably, our approach introduces no additional overhead at inference time and requires no prompts during caption generation. We find that this approach mitigates catastrophic forgetting while achieving better semantic caption alignment compared to state-of-the-art methods. The code can be found via the following link: https://github.com/Gepardius/Taetz_Bordelius_Continual_ImageCaptioning.
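
How three auxiliary terms of this kind might be added to a cross-entropy loss is sketched below on plain embedding arrays. This is a generic illustration, not the code from the linked repository: the symmetric contrastive form of the CLIP-style term, the Euclidean triplet distance, the temperature, the margin, and the weights `w` are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row of x to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def cosine_loss(img, prompt):
    """Mean of (1 - cosine similarity) between paired rows."""
    return float(np.mean(1.0 - np.sum(l2_normalize(img) * l2_normalize(prompt), axis=1)))

def log_softmax(z, axis):
    """Numerically stable log-softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def clip_style_loss(img, txt, temperature=0.07):
    """Symmetric contrastive loss; row i of img is assumed to match row i of txt."""
    logits = l2_normalize(img) @ l2_normalize(txt).T / temperature
    idx = np.arange(len(img))
    image_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    text_to_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return float(0.5 * (image_to_text + text_to_image))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on (distance to positive) - (distance to negative) + margin."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

def total_loss(ce, img, prompt, txt, pos, neg, w=(1.0, 1.0, 1.0)):
    """Cross-entropy value ce plus the three weighted auxiliary terms (weights assumed)."""
    return (ce + w[0] * cosine_loss(img, prompt)
            + w[1] * clip_style_loss(img, txt)
            + w[2] * triplet_loss(img, pos, neg))
```

All three auxiliary terms act only on embeddings during training, which is consistent with the statement that nothing extra is needed at inference time.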

CV May 7, 2021
Autoencoder Based Inter-Vehicle Generalization for In-Cabin Occupant Classification

Steve Dias Da Cruz, Bertram Taetz, Oliver Wasenmüller et al.

Common domain shift problem formulations consider the integration of multiple source domains, or the target domain during training. Regarding the generalization of machine learning models between different car interiors, we formulate the criterion of training in a single vehicle: without access to the target distribution of the vehicle the model would be deployed to, and without access to multiple vehicles during training. We performed an investigation on the SVIRO dataset for occupant classification on the rear bench and propose an autoencoder based approach to improve the transferability. The autoencoder is on par with commonly used classification models when trained from scratch and sometimes outperforms models pre-trained on a large amount of data. Moreover, the autoencoder can transform images from unknown vehicles into the vehicle it was trained on. These results are corroborated by an evaluation on real infrared images from two vehicle interiors.

CV Nov 6, 2020
Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function

Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter et al.

Images recorded during the lifetime of computer vision based systems undergo a wide range of illumination and environmental conditions affecting the reliability of previously trained machine learning models. Image normalization is hence a valuable preprocessing component to enhance the models' robustness. To this end, we introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images (e.g. environmental features and illumination changes) to focus on the reconstruction of the salient features (e.g. class instances). Our method exploits the availability of identical sceneries under different illumination and environmental conditions for which we formulate a partially impossible reconstruction target: the input image will not convey enough information to reconstruct the target in its entirety. Its applicability is assessed on three publicly available datasets. We combine the triplet loss as a regularizer in the latent space representation and a nearest neighbour search to improve the generalization to unseen illuminations and class instances. The importance of the aforementioned post-processing is highlighted on an automotive application. To this end, we release a synthetic dataset of sceneries from three different passenger compartments where each scenery is rendered under ten different illumination and environmental conditions: see https://sviro.kl.dfki.de
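
One reading of the partially impossible reconstruction target is that the network receives a scene under one illumination and is asked to reconstruct the same scene under a different one. The sketch below only shows how such input/target pairs could be drawn. The array layout `(n_scenes, n_variants, H, W)` and the function name are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def sample_impossible_pairs(images, batch_size, rng):
    """Draw (input, target) pairs showing the same scene under different variants.

    images: array of shape (n_scenes, n_variants, H, W), n_variants >= 2.
    Returns the inputs, the targets, and the sampled scene/variant indices.
    """
    n_scenes, n_variants = images.shape[:2]
    scene = rng.integers(0, n_scenes, size=batch_size)
    v_in = rng.integers(0, n_variants, size=batch_size)
    # Shift by 1 .. n_variants-1 so the target variant always differs from the input.
    v_out = (v_in + rng.integers(1, n_variants, size=batch_size)) % n_variants
    return images[scene, v_in], images[scene, v_out], scene, v_in, v_out
```

A reconstruction loss between the decoded input and the differently lit target then cannot be driven to zero by copying illumination from the input, which pushes the model to keep only what the variants share.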

CV Mar 7, 2017
Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation

Christian Bailer, Bertram Taetz, Didier Stricker

Modern large displacement optical flow algorithms usually use an initialization by either sparse descriptor matching techniques or dense approximate nearest neighbor fields. While the latter have the advantage of being dense, they have the major disadvantage of being very outlier-prone as they are not designed to find the optical flow, but the visually most similar correspondence. In this article we present a dense correspondence field approach that is much less outlier-prone and thus much better suited for optical flow estimation than approximate nearest neighbor fields. Our approach does not require explicit regularization, smoothing (like median filtering) or a new data term. Instead we solely rely on patch matching techniques and a novel multi-scale matching strategy. We also present enhancements for outlier filtering. We show that our approach is better suited for large displacement optical flow estimation than modern descriptor matching techniques. We do so by initializing EpicFlow with our approach instead of their originally used state-of-the-art descriptor matching technique. We significantly outperform the original EpicFlow on MPI-Sintel, KITTI 2012, KITTI 2015 and Middlebury. In this extended article of our former conference publication we further improve our approach in matching accuracy as well as runtime and present more experiments and insights.

SY Jun 12, 2016
Towards Self-Calibrating Inertial Body Motion Capture

Bertram Taetz, Gabriele Bleser, Markus Miezal

This paper presents a novel online-capable method for simultaneous estimation of human motion in terms of segment orientations and positions along with sensor-to-segment calibration parameters from inertial sensors attached to the body. In order to solve this ill-posed estimation problem, state-of-the-art motion, measurement and biomechanical models are combined with new stochastic equations and priors. These are based on the kinematics of multi-body systems, anatomical and body shape information, as well as parameter properties for regularisation. This leads to a constrained weighted least squares problem that is solved in a sliding window fashion. Magnetometer information is currently only used for initialisation, while the estimation itself works without magnetometers. The method was tested on simulated as well as real data, captured from a lower body configuration.

CV Aug 21, 2015
Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation

Christian Bailer, Bertram Taetz, Didier Stricker

Modern large displacement optical flow algorithms usually use an initialization by either sparse descriptor matching techniques or dense approximate nearest neighbor fields. While the latter have the advantage of being dense, they have the major disadvantage of being very outlier prone as they are not designed to find the optical flow, but the visually most similar correspondence. In this paper we present a dense correspondence field approach that is much less outlier prone and thus much better suited for optical flow estimation than approximate nearest neighbor fields. Our approach is conceptually novel as it does not require explicit regularization, smoothing (like median filtering) or a new data term, but solely our novel purely data based search strategy that finds most inliers (even for small objects), while it effectively avoids finding outliers. Moreover, we present novel enhancements for outlier filtering. We show that our approach is better suited for large displacement optical flow estimation than state-of-the-art descriptor matching techniques. We do so by initializing EpicFlow (so far the best method on MPI-Sintel) with our Flow Fields instead of their originally used state-of-the-art descriptor matching technique. We significantly outperform the original EpicFlow on MPI-Sintel, KITTI and Middlebury.