Roland Herzog

LG
h-index: 4
Papers: 15
Citations: 128
Novelty: 41%
AI Score: 50

15 Papers

64.7 OC May 4
Subspace accelerated measure transport methods for fast and scalable sequential experimental design, with application to photoacoustic imaging

Tiangang Cui, Karina Koval, Roland Herzog et al.

We propose a novel approach for sequential optimal experimental design (sOED) for Bayesian inverse problems involving expensive models with high-dimensional unknown parameters. This work focuses on designs that maximize the expected information gain (EIG) from prior to posterior, a task that is computationally very challenging in non-Gaussian settings. This challenge is amplified in sOED, as the incremental expected information gain (iEIG) must be repeatedly approximated across distinct stages, with both prior and posterior distributions being intractable. To address this, we derive a general-purpose, derivative-based upper bound for the iEIG, which not only guides design placement but also enables the construction of projectors onto likelihood-informed subspaces, facilitating parameter dimension reduction. By combining this approach with conditional measure transport maps for the sequence of posteriors, we develop a unified sOED and amortized inference framework scalable to high- and infinite-dimensional problems. Numerical experiments for two inverse problems governed by partial differential equations (PDEs) demonstrate the effectiveness of designs obtained by maximizing the proposed bound.
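
For orientation, the EIG and its incremental variant have closed forms in the linear-Gaussian special case, which the abstract contrasts with the harder non-Gaussian setting. The sketch below is not the paper's method: it assumes a hypothetical model y = G x + e with Gaussian prior and noise, and the function names `gaussian_eig` and `greedy_design` are illustrative only. It picks one sensor row per stage by maximizing the incremental EIG and then updates the posterior covariance.

```python
import numpy as np

def gaussian_eig(G, prior_cov, noise_cov):
    """EIG in nats for y = G x + e, x ~ N(0, prior_cov), e ~ N(0, noise_cov).

    Uses the closed form 0.5 * log det(I + noise_cov^{-1} G prior_cov G^T).
    """
    m = G.shape[0]
    S = np.linalg.solve(noise_cov, G @ prior_cov @ G.T)
    _, logdet = np.linalg.slogdet(np.eye(m) + S)
    return 0.5 * logdet

def greedy_design(candidates, prior_cov, noise_var, n_stages):
    """Greedy sequential design: one scalar observation per stage.

    candidates: array of shape (k, n), each row a possible observation operator.
    Returns the chosen row indices, the incremental gains, and the final covariance.
    """
    cov = prior_cov.copy()
    chosen, gains = [], []
    noise = np.array([[noise_var]])
    for _ in range(n_stages):
        # Incremental EIG of each candidate, measured from the current posterior.
        stage_gains = [gaussian_eig(g[None, :], cov, noise) for g in candidates]
        k = int(np.argmax(stage_gains))
        chosen.append(k)
        gains.append(stage_gains[k])
        # Rank-one (Kalman) update of the covariance after observing row k.
        g = candidates[k][None, :]
        s = (g @ cov @ g.T).item() + noise_var
        cov = cov - (cov @ g.T @ g @ cov) / s
    return chosen, gains, cov

candidates = np.array([[1.0, 0.0], [0.0, 2.0]])
chosen, gains, post_cov = greedy_design(candidates, np.eye(2), 1.0, n_stages=2)
```

In this Gaussian case the incremental gains add up to the EIG of the combined experiment, which is what makes the stage-wise view consistent; in the non-Gaussian setting of the paper no such closed form is available.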

NA Mar 27, 2019
Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds

Ronny Bergmann, Roland Herzog

Karush-Kuhn-Tucker (KKT) conditions for equality and inequality constrained optimization problems on smooth manifolds are formulated. Under the Guignard constraint qualification, local minimizers are shown to admit Lagrange multipliers. The linear independence, Mangasarian-Fromovitz, and Abadie constraint qualifications are also formulated, and the chain "LICQ implies MFCQ implies ACQ implies GCQ" is proved. Moreover, classical connections between these constraint qualifications and the set of Lagrange multipliers are established, which parallel the results in Euclidean space. The constrained Riemannian center of mass on the sphere serves as an illustrating numerical example.
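
As a reading aid, the KKT system referred to above can be written in the following standard form. The notation is an assumption made here, not taken from the paper: $\mathcal{M}$ is the manifold, $f$ the objective, $g$ the inequality constraints, $h$ the equality constraints, and differentials are taken in the cotangent space at $p$.

```latex
% Problem: minimize f(p) over p in M subject to g(p) <= 0 and h(p) = 0,
% with g : M -> R^m and h : M -> R^q (assumed notation).
% KKT conditions at p: there exist multipliers mu in R^m, lambda in R^q with
\begin{align*}
  \mathrm{d}f(p)
  + \sum_{i=1}^{m} \mu_i \, \mathrm{d}g_i(p)
  + \sum_{j=1}^{q} \lambda_j \, \mathrm{d}h_j(p) &= 0
  \quad \text{in } T_p^*\mathcal{M}, \\
  h(p) &= 0, \\
  \mu \ge 0, \quad g(p) \le 0, \quad \mu_i \, g_i(p) &= 0
  \quad \text{for } i = 1, \dots, m.
\end{align*}
```

For $\mathcal{M} = \mathbb{R}^n$ this reduces to the familiar Euclidean KKT system.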

NA Aug 16, 2018
Discrete Total Variation with Finite Elements and Applications to Imaging

Marc Herrmann, Roland Herzog, Stephan Schmidt et al.

The total variation (TV)-seminorm is considered for piecewise polynomial, globally discontinuous (DG) and continuous (CG) finite element functions on simplicial meshes. A novel, discrete variant (DTV) based on a nodal quadrature formula is defined. DTV has favorable properties, compared to the original TV-seminorm for finite element functions. These include a convenient dual representation in terms of the supremum over the space of Raviart--Thomas finite element functions, subject to a set of simple constraints. It can therefore be shown that a variety of algorithms for classical image reconstruction problems, including TV-$L^2$ and TV-$L^1$, can be implemented in low and higher-order finite element spaces with the same efficiency as their counterparts originally developed for images on Cartesian grids.

LG Sep 6, 2023 Code
Introducing Thermodynamics-Informed Symbolic Regression -- A Tool for Thermodynamic Equations of State Development

Viktor Martinek, Ophelia Frotscher, Markus Richter et al.

Thermodynamic equations of state (EOS) are essential for many industries as well as in academia. Even leaving aside the expensive and extensive measurement campaigns required for the data acquisition, the development of EOS is an intensely time-consuming process, which often still relies heavily on expert knowledge and iterative fine-tuning. To improve upon and accelerate the EOS development process, we introduce thermodynamics-informed symbolic regression (TiSR), a symbolic regression (SR) tool aimed at thermodynamic EOS modeling. TiSR is already a capable SR tool, which was used in the research of https://doi.org/10.1007/s10765-023-03197-z. It aims to combine an SR base with the extensions required to work with often strongly scattered experimental data, different residual pre- and post-processing options, and additional features required to consider thermodynamic EOS development. Although TiSR is not ready for end users yet, this paper is intended to report on its current state, showcase the progress, and discuss (distant and not so distant) future directions. TiSR is available at https://github.com/scoop-group/TiSR and can be cited as https://doi.org/10.5281/zenodo.8317547.

NA Jan 12, 2018
Fast iterative solvers for an optimal transport problem

Roland Herzog, John W. Pearson, Martin Stoll

Optimal transport problems pose many challenges when considering their numerical treatment. We investigate the solution of a PDE-constrained optimisation problem subject to a particular transport equation arising from the modelling of image metamorphosis. We present the nonlinear optimisation problem, and discuss the discretisation and treatment of the nonlinearity via a Gauss--Newton scheme. We then derive preconditioners that can be used to solve the linear systems at the heart of the (Gauss--)Newton method. With the optical flow in mind, we further propose the reduction of dimensionality by choosing a radial basis function discretisation that uses the centres of superpixels as the collocation points. Again, we derive suitable preconditioners that can be used for this formulation.

LG Jun 28, 2023 Code
Time Regularization in Optimal Time Variable Learning

Evelyn Herberg, Roland Herzog, Frederik Köhne

Recently, optimal time variable learning in deep neural networks (DNNs) was introduced in arXiv:2204.08528. In this manuscript we extend the concept by introducing a regularization term that directly relates to the time horizon in discrete dynamical systems. Furthermore, we propose an adaptive pruning approach for Residual Neural Networks (ResNets), which reduces network complexity without compromising expressiveness, while simultaneously decreasing training time. The results are illustrated by applying the proposed concepts to classification tasks on the well-known MNIST and Fashion MNIST data sets. Our PyTorch code is available on https://github.com/frederikkoehne/time_variable_learning.

OC Aug 17, 2018
Optimum Experimental Design for Interface Identification Problems

Tommy Etling, Roland Herzog, Martin Siebenborn

The identification of the interface of an inclusion in a diffusion process is considered. This task is viewed as a parameter identification problem in which the parameter space bears the structure of a shape manifold. A corresponding optimum experimental design (OED) problem is formulated in which the activation pattern of an array of sensors in space and time serves as experimental condition. The goal is to improve the estimation precision within a certain subspace of the infinite dimensional tangent space of shape variations to the manifold, and to find those shape variations of best and worst identifiability. Numerical results for the OED problem obtained by a simplicial decomposition algorithm are presented.

LG Nov 27, 2023 Code
SensLI: Sensitivity-Based Layer Insertion for Neural Networks

Leonie Kreis, Evelyn Herberg, Frederik Köhne et al.

The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic approach to inserting new layers during the training process. Our method eliminates the need to choose a fixed network size before training, is numerically inexpensive to execute and applicable to various architectures including fully connected feedforward networks, ResNets and CNNs. Our technique borrows ideas from constrained optimization and is based on first-order sensitivity information of the loss function with respect to the virtual parameters that additional layers, if inserted, would offer. In numerical experiments, our proposed sensitivity-based layer insertion technique (SensLI) exhibits improved performance on training loss and test error, compared to training on a fixed architecture, and reduced computational effort in comparison to training the extended architecture from the beginning. Our code is available on https://github.com/mathemml/SensLI.

LG Nov 26, 2023
Frobenius-Type Norms and Inner Products of Matrices and Linear Maps with Applications to Neural Network Training

Roland Herzog, Frederik Köhne, Leonie Kreis et al.

The Frobenius norm is a frequent choice of norm for matrices. In particular, the underlying Frobenius inner product is typically used to evaluate the gradient of an objective with respect to matrix variables, such as those occurring in the training of neural networks. We provide a broader view on the Frobenius norm and inner product for linear maps or matrices, and establish their dependence on inner products in the domain and co-domain spaces. This shows that the classical Frobenius norm is merely one special element of a family of more general Frobenius-type norms. The significant extra freedom furnished by this realization can be used, among other things, to precondition neural network training.
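
One way to make the dependence on the domain and co-domain inner products concrete is the Hilbert-Schmidt construction sketched below. This is an illustration under stated assumptions and may differ from the paper's exact formulation: the inner products are assumed to be represented by symmetric positive definite matrices `M_dom` and `M_cod`, and all function names are hypothetical.

```python
import numpy as np

def frobenius_type_inner(A, B, M_dom, M_cod):
    """Inner product <A, B> = trace(M_dom^{-1} A^T M_cod B) for m-by-n matrices.

    M_dom (n-by-n) and M_cod (m-by-m) are assumed symmetric positive definite.
    """
    return float(np.trace(np.linalg.solve(M_dom, A.T @ M_cod @ B)))

def frobenius_type_norm(A, M_dom, M_cod):
    """Norm induced by frobenius_type_inner."""
    return np.sqrt(frobenius_type_inner(A, A, M_dom, M_cod))

def frobenius_type_gradient(G, M_dom, M_cod):
    """Gradient with respect to the inner product above.

    G is the classical gradient (matrix of partial derivatives). The result
    M_cod^{-1} G M_dom satisfies <grad, D> = trace(G^T D) for every direction D.
    """
    return np.linalg.solve(M_cod, G) @ M_dom

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
classical = frobenius_type_norm(A, np.eye(2), np.eye(3))
```

With identity matrices for both inner products the classical Frobenius norm is recovered; other choices change the gradient by left and right multiplication, which is one way to read the preconditioning remark in the abstract.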

OC Nov 28, 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent

Frederik Köhne, Leonie Kreis, Anton Schiela et al.

This paper proposes a novel approach to adaptive step sizes in stochastic gradient descent (SGD) by utilizing quantities that we have identified as numerically traceable -- the Lipschitz constant for gradients and a concept of the local variance in search directions. Our findings yield a nearly hyperparameter-free algorithm for stochastic optimization, which has provable convergence properties and exhibits truly problem adaptive behavior on classical image classification tasks. Our framework is set in a general Hilbert space and thus enables the potential inclusion of a preconditioner through the choice of the inner product.

LG Jan 7
Symbolic Regression for Shared Expressions: Introducing Partial Parameter Sharing

Viktor Martinek, Roland Herzog

Symbolic Regression aims to find symbolic expressions that describe datasets. Due to better interpretability, it is a machine learning paradigm particularly powerful for scientific discovery. In recent years, several works have expanded the concept to allow the description of similar phenomena using a single expression with varying sets of parameters, thereby introducing categorical variables. Some previous works allow only "non-shared" (category-value-specific) parameters, and others also incorporate "shared" (category-value-agnostic) parameters. We expand upon those efforts by considering multiple categorical variables, and introducing intermediate levels of parameter sharing. With two categorical variables, an intermediate level of parameter sharing emerges, i.e., parameters that are shared across one categorical variable but vary across the other. The new approach potentially decreases the number of parameters, while revealing additional information about the problem. Using a synthetic, fitting-only example, we test the limits of this setup in terms of data requirement reduction and transfer learning. As a real-world symbolic regression example, we demonstrate the benefits of the proposed approach on an astrophysics dataset used in a previous study, which considered only one categorical variable. We achieve a similar fit quality but require significantly fewer individual parameters, and extract additional information about the problem.
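
The intermediate sharing level can be illustrated with a small fitting-only sketch. Everything here is a hypothetical example, not the paper's setup: the expression is fixed to y = a[c1] * x + b[c2], so the slope `a` varies with the first categorical variable and is shared across the second, while the offset `b` does the opposite. Because the model is linear in its parameters, an ordinary least-squares fit suffices.

```python
import numpy as np

def fit_partially_shared(x, y, c1, c2, n1, n2):
    """Least-squares fit of y = a[c1] * x + b[c2].

    c1, c2 are integer category codes in range(n1) and range(n2).
    Returns the n1 slopes and the n2 offsets.
    """
    n_samples = len(x)
    rows = np.arange(n_samples)
    design = np.zeros((n_samples, n1 + n2))
    design[rows, c1] = x          # slope column selected by the first category
    design[rows, n1 + c2] = 1.0   # offset column selected by the second category
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta[:n1], theta[n1:]

# Synthetic, noise-free data on a full grid of categories and x values.
n1, n2 = 3, 2
a_true = np.array([1.0, -2.0, 0.5])
b_true = np.array([3.0, -1.0])
c1, c2, x = (arr.ravel() for arr in np.meshgrid(
    np.arange(n1), np.arange(n2), np.linspace(0.0, 1.0, 5), indexing="ij"))
y = a_true[c1] * x + b_true[c2]
a_fit, b_fit = fit_partially_shared(x, y, c1, c2, n1, n2)
```

This model has n1 + n2 = 5 parameters, whereas giving every (c1, c2) combination its own slope and offset would require 2 * n1 * n2 = 12, which is the kind of reduction the abstract describes.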

CV Nov 6, 2025
Geometry Denoising with Preferred Normal Vectors

Manuel Weiß, Lukas Baumgärtner, Roland Herzog et al.

We introduce a new paradigm for geometry denoising using prior knowledge about the surface normal vector. This prior knowledge comes in the form of a set of preferred normal vectors, which we refer to as label vectors. A segmentation problem is naturally embedded in the denoising process. The segmentation is based on the similarity of the normal vector to the elements of the set of label vectors. Regularization is achieved by a total variation term. We formulate a split Bregman (ADMM) approach to solve the resulting optimization problem. The vertex update step is based on second-order shape calculus.

CV Nov 30, 2024
Two Models for Surface Segmentation using the Total Variation of the Normal Vector

Lukas Baumgärtner, Ronny Bergmann, Roland Herzog et al.

We consider the problem of surface segmentation, where the goal is to partition a surface represented by a triangular mesh. The segmentation is based on the similarity of the normal vector field to a given set of label vectors. We propose a variational approach and compare two different regularizers, both based on a total variation measure. The first regularizer penalizes the total variation of the assignment function directly, while the second regularizer penalizes the total variation in the label space. In order to solve the resulting optimization problems, we use variations of the split Bregman (ADMM) iteration adapted to the problem at hand. While computationally more expensive, the second regularizer yields better results in our experiments; in particular, it removes noise more reliably in regions of constant curvature.

CV Jul 17, 2025
Total Generalized Variation of the Normal Vector Field and Applications to Mesh Denoising

Lukas Baumgärtner, Ronny Bergmann, Roland Herzog et al.

We propose a novel formulation for the second-order total generalized variation (TGV) of the normal vector on an oriented, triangular mesh embedded in $\mathbb{R}^3$. The normal vector is considered as a manifold-valued function, taking values on the unit sphere. Our formulation extends previous discrete TGV models for piecewise constant scalar data that utilize a Raviart-Thomas function space. To extend this formulation to the manifold setting, a tailor-made tangential Raviart-Thomas type finite element space is constructed in this work. The new regularizer is compared to existing methods in mesh denoising experiments.

NA Sep 7, 2016
A modified implementation of MINRES to monitor residual subvector norms for block systems

Roland Herzog, Kirk M. Soodhalter

Saddle-point systems, i.e., structured linear systems with symmetric matrices, are considered. A modified implementation of (preconditioned) MINRES is derived which allows the norms of the residual subvectors to be monitored individually. Compared to the implementation from the textbook of [Elman, Sylvester and Wathen, Oxford University Press, 2014], our method requires one extra vector of storage and no additional applications of the preconditioner. Numerical experiments are included.