Jonas Latz

NA
h-index69
13papers
364citations
Novelty43%
AI Score28

13 Papers

NAFeb 8, 2023
Can Physics-Informed Neural Networks beat the Finite Element Method?

Tamara G. Grossmann, Urszula Julia Komorowska, Jonas Latz et al.

Partial differential equations play a fundamental role in the mathematical modelling of many processes and systems in physical, biological and other sciences. To simulate such processes and systems, the solutions of PDEs often need to be approximated numerically. The finite element method, for instance, is a usual standard methodology to do so. The recent success of deep neural networks at various approximation tasks has motivated their use in the numerical solution of PDEs. These so-called physics-informed neural networks and their variants have shown to be able to successfully approximate a large range of partial differential equations. So far, physics-informed neural networks and the finite element method have mainly been studied in isolation of each other. In this work, we compare the methodologies in a systematic computational study. Indeed, we employ both methods to numerically solve various linear and nonlinear partial differential equations: Poisson in 1D, 2D, and 3D, Allen-Cahn in 1D, semilinear Schrödinger in 1D and 2D. We then compare computational costs and approximation accuracies. In terms of solution time and accuracy, physics-informed neural networks have not been able to outperform the finite element method in our study. In some experiments, they were faster at evaluating the solved PDE.

CVAug 11, 2022
Joint reconstruction-segmentation on graphs

Jeremy Budd, Yves van Gennip, Jonas Latz et al.

Practical image segmentation tasks concern images which must be reconstructed from noisy, distorted, and/or incomplete observations. A recent approach for solving such tasks is to perform this reconstruction jointly with the segmentation, using each to guide the other. However, this work has so far employed relatively simple segmentation methods, such as the Chan--Vese algorithm. In this paper, we present a method for joint reconstruction-segmentation using graph-based segmentation methods, which have been seeing increasing recent interest. Complications arise due to the large size of the matrices involved, and we show how these complications can be managed. We then analyse the convergence properties of our scheme. Finally, we apply this scheme to distorted versions of ``two cows'' images familiar from previous graph-based segmentation literature, first to a highly noised version and second to a blurred version, achieving highly accurate segmentations in both cases. We compare these results to those obtained by sequential reconstruction-segmentation approaches, finding that our method competes with, or even outperforms, those approaches in terms of reconstruction and segmentation accuracy.

NAFeb 15, 2019
Bayesian parameter identification in Cahn-Hilliard models for biological growth

Christian Kahle, Kei Fong Lam, Jonas Latz et al.

We consider the inverse problem of parameter estimation in a diffuse interface model for tumour growth. The model consists of a fourth-order Cahn-Hilliard system and contains three phenomenological parameters: the tumour proliferation rate, the nutrient consumption rate, and the chemotactic sensitivity. We study the inverse problem within the Bayesian framework and construct the likelihood and noise for two typical observation settings. One setting involves an infinite-dimensional data space where we observe the full tumour. In the second setting we observe only the tumour volume, hence the data space is finite-dimensional. We show the well-posedness of the posterior measure for both settings, building upon and improving the analytical results in [C. Kahle and K.F. Lam, Appl. Math. Optim. (2018)]. A numerical example involving synthetic data is presented in which the posterior measure is numerically approximated by the sequential Monte Carlo approach with tempering.

COFeb 22, 2019
A practical example for the non-linear Bayesian filtering of model parameters

Matthieu Bulté, Jonas Latz, Elisabeth Ullmann

In this tutorial we consider the non-linear Bayesian filtering of static parameters in a time-dependent model. We outline the theoretical background and discuss appropriate solvers. We focus on particle-based filters and present Sequential Importance Sampling (SIS) and Sequential Monte Carlo (SMC). Throughout the paper we illustrate the concepts and techniques with a practical example using real-world data. The task is to estimate the gravitational acceleration of the Earth $g$ by using observations collected from a simple pendulum. Importantly, the particle filters enable the adaptive updating of the estimate for $g$ as new observations become available. For tutorial purposes we provide the data set and a Python implementation of the particle filters.

OCSep 8, 2022
Losing momentum in continuous-time stochastic optimisation

Kexin Jin, Jonas Latz, Chenguang Liu et al.

The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become particularly widespread. The stochasticity arises from data subsampling which reduces computational cost. Both, momentum and stochasticity help the algorithm to converge globally. In this work, we propose and analyse a continuous-time model for stochastic gradient descent with momentum. This model is a piecewise-deterministic Markov process that represents the optimiser by an underdamped dynamical system and the data subsampling through a stochastic switching. We investigate longtime limits, the subsampling-to-no-subsampling limit, and the momentum-to-no-momentum limit. We are particularly interested in the case of reducing the momentum over time. Under convexity assumptions, we show convergence of our dynamical system to the global minimiser when reducing momentum over time and letting the subsampling rate go to infinity. We then propose a stable, symplectic discretisation scheme to construct an algorithm from our continuous-time dynamical system. In experiments, we study our scheme in convex and non-convex test problems. Additionally, we train a convolutional neural network in an image classification problem. Our algorithm {attains} competitive results compared to stochastic gradient descent with momentum.

NAMar 22, 2022
Gradient flows and randomised thresholding: sparse inversion and classification

Jonas Latz

Sparse inversion and classification problems are ubiquitous in modern data science and imaging. They are often formulated as non-smooth minimisation problems. In sparse inversion, we minimise, e.g., the sum of a data fidelity term and an L1/LASSO regulariser. In classification, we consider, e.g., the sum of a data fidelity term and a non-smooth Ginzburg--Landau energy. Standard (sub)gradient descent methods have shown to be inefficient when approaching such problems. Splitting techniques are much more useful: here, the target function is partitioned into a sum of two subtarget functions -- each of which can be efficiently optimised. Splitting proceeds by performing optimisation steps alternately with respect to each of the two subtarget functions. In this work, we study splitting from a stochastic continuous-time perspective. Indeed, we define a differential inclusion that follows one of the two subtarget function's negative subdifferential at each point in time. The choice of the subtarget function is controlled by a binary continuous-time Markov process. The resulting dynamical system is a stochastic approximation of the underlying subgradient flow. We investigate this stochastic approximation for an L1-regularised sparse inversion flow and for a discrete Allen-Cahn equation minimising a Ginzburg--Landau energy. In both cases, we study the longtime behaviour of the stochastic dynamical system and its ability to approximate the underlying subgradient flow at any accuracy. We illustrate our theoretical findings in a simple sparse estimation problem and also in low- and high-dimensional classification problems.

LGJul 11, 2024
How to beat a Bayesian adversary

Zihan Ding, Kexin Jin, Jonas Latz et al.

Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model's prediction through a small, directed perturbation of the model's input - an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram - a continuous-time particle system that shall approximate the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean-Vlasov process and justify the use of Abram by giving assumptions under which the McKean-Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.

MED-PHMar 7, 2024
A Learnable Prior Improves Inverse Tumor Growth Modeling

Jonas Weidner, Ivan Ezhov, Michal Balcerak et al.

Biophysical modeling, particularly involving partial differential equations (PDEs), offers significant potential for tailoring disease treatment protocols to individual patients. However, the inverse problem-solving aspect of these models presents a substantial challenge, either due to the high computational requirements of model-based approaches or the limited robustness of deep learning (DL) methods. We propose a novel framework that leverages the unique strengths of both approaches in a synergistic manner. Our method incorporates a DL ensemble for initial parameter estimation, facilitating efficient downstream evolutionary sampling initialized with this DL-based prior. We showcase the effectiveness of integrating a rapid deep-learning algorithm with a high-precision evolution strategy in estimating brain tumor cell concentrations from magnetic resonance images. The DL-Prior plays a pivotal role, significantly constraining the effective sampling-parameter space. This reduction results in a fivefold convergence acceleration and a Dice-score of 95%.

MLMay 23, 2023
Subsampling Error in Stochastic Gradient Langevin Diffusions

Kexin Jin, Chenguang Liu, Jonas Latz

The Stochastic Gradient Langevin Dynamics (SGLD) are popularly used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data. As opposed to many usual Markov chain Monte Carlo (MCMC) algorithms, SGLD is not stationary with respect to the posterior distribution; two sources of error appear: The first error is introduced by an Euler--Maruyama discretisation of a Langevin diffusion process, the second error comes from the data subsampling that enables its use in large-scale data settings. In this work, we consider an idealised version of SGLD to analyse the method's pure subsampling error that we then see as a best-case error for diffusion-based subsampling MCMC methods. Indeed, we introduce and study the Stochastic Gradient Langevin Diffusion (SGLDiff), a continuous-time Markov process that follows the Langevin diffusion corresponding to a data subset and switches this data subset after exponential waiting times. There, we show the exponential ergodicity of SLGDiff and that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time. We bring our results into context with other analyses of SGLD.

LGDec 7, 2021
A Continuous-time Stochastic Gradient Descent Method for Continuous Data

Kexin Jin, Jonas Latz, Chenguang Liu et al.

Optimization problems with continuous data appear in, e.g., robust machine learning, functional data analysis, and variational inference. Here, the target function is given as an integral over a family of (continuously) indexed target functions - integrated with respect to a probability measure. Such problems can often be solved by stochastic optimization methods: performing optimization steps with respect to the indexed target function with randomly switched indices. In this work, we study a continuous-time variant of the stochastic gradient descent algorithm for optimization problems with continuous data. This so-called stochastic gradient process consists in a gradient flow minimizing an indexed target function that is coupled with a continuous-time index process determining the index. Index processes are, e.g., reflected diffusions, pure jump processes, or other Lévy processes on compact spaces. Thus, we study multiple sampling patterns for the continuous data space and allow for data simulated or streamed at runtime of the algorithm. We analyze the approximation properties of the stochastic gradient process and study its longtime behavior and ergodicity under constant and decreasing learning rates. We end with illustrating the applicability of the stochastic gradient process in a polynomial regression problem with noisy functional data, as well as in a physics-informed neural network.

PRApr 15, 2020
Analysis of Stochastic Gradient Descent in Continuous Time

Jonas Latz

Stochastic gradient descent is an optimisation method that combines classical gradient descent with random subsampling within the target functional. In this work, we introduce the stochastic gradient process as a continuous-time representation of stochastic gradient descent. The stochastic gradient process is a dynamical system that is coupled with a continuous-time Markov process living on a finite state space. The dynamical system -- a gradient flow -- represents the gradient descent part, the process on the finite state space represents the random subsampling. Processes of this type are, for instance, used to model clonal populations in fluctuating environments. After introducing it, we study theoretical properties of the stochastic gradient process: We show that it converges weakly to the gradient flow with respect to the full target function, as the learning rate approaches zero. We give conditions under which the stochastic gradient process with constant learning rate is exponentially ergodic in the Wasserstein sense. Then we study the case, where the learning rate goes to zero sufficiently slowly and the single target functions are strongly convex. In this case, the process converges weakly to the point mass concentrated in the global minimum of the full target function; indicating consistency of the method. We conclude after a discussion of discretisation strategies for the stochastic gradient process and numerical experiments.

NAJan 24, 2020
Certified and fast computations with shallow covariance kernels

Daniel Kressner, Jonas Latz, Stefano Massei et al.

Many techniques for data science and uncertainty quantification demand efficient tools to handle Gaussian random fields, which are defined in terms of their mean functions and covariance operators. Recently, parameterized Gaussian random fields have gained increased attention, due to their higher degree of flexibility. However, especially if the random field is parameterized through its covariance operator, classical random field discretization techniques fail or become inefficient. In this work we introduce and analyze a new and certified algorithm for the low-rank approximation of a parameterized family of covariance operators which represents an extension of the adaptive cross approximation method for symmetric positive definite matrices. The algorithm relies on an affine linear expansion of the covariance operator with respect to the parameters, which needs to be computed in a preprocessing step using, e.g., the empirical interpolation method. We discuss and test our new approach for isotropic covariance kernels, such as Matérn kernels. The numerical results demonstrate the advantages of our approach in terms of computational time and confirm that the proposed algorithm provides the basis of a fast sampling procedure for parameter dependent Gaussian random fields.

STFeb 26, 2019
On the well-posedness of Bayesian inverse problems

Jonas Latz

The subject of this article is the introduction of a new concept of well-posedness of Bayesian inverse problems. The conventional concept of (Lipschitz, Hellinger) well-posedness in [Stuart 2010, Acta Numerica 19, pp. 451-559] is difficult to verify in practice and may be inappropriate in some contexts. Our concept simply replaces the Lipschitz continuity of the posterior measure in the Hellinger distance by continuity in an appropriate distance between probability measures. Aside from the Hellinger distance, we investigate well-posedness with respect to weak convergence, the total variation distance, the Wasserstein distance, and also the Kullback--Leibler divergence. We demonstrate that the weakening to continuity is tolerable and that the generalisation to other distances is important. The main results of this article are proofs of well-posedness with respect to some of the aforementioned distances for large classes of Bayesian inverse problems. Here, little or no information about the underlying model is necessary; making these results particularly interesting for practitioners using black-box models. We illustrate our findings with numerical examples motivated from machine learning and image processing.