Lisa Maria Kreusser

LG
10 papers
71 citations
Novelty: 41%
AI Score: 26

10 Papers

AP May 22, 2018
ODE and PDE based modeling of biological transportation networks

Jan Haskovec, Lisa Maria Kreusser, Peter Markowich

We study the global existence of solutions of a discrete (ODE-based) model on a graph describing the formation of biological transportation networks, introduced by Hu and Cai. We propose an adaptation of this model so that a macroscopic (PDE-based) system can be obtained as its formal continuum limit. We prove the global existence of weak solutions of the macroscopic PDE model. Finally, we present results of numerical simulations of the discrete model, illustrating the convergence to steady states, their non-uniqueness, and their dependence on initial data and model parameters.

AP Apr 20, 2017
Pattern formation of a nonlocal, anisotropic interaction model

Martin Burger, Bertram Düring, Lisa Maria Kreusser et al.

We consider a class of interacting particle models with anisotropic, repulsive-attractive interaction forces whose orientations depend on an underlying tensor field. An example of this class of models is the so-called Kücken-Champod model describing the formation of fingerprint patterns. This class of models can be regarded as a generalization of a gradient flow of a nonlocal interaction potential which has a local repulsion and a long-range attraction structure. In contrast to isotropic interaction models, the anisotropic forces in our class of models cannot be derived from a potential. The underlying tensor field introduces an anisotropy leading to complex patterns which do not occur in isotropic models. This anisotropy is characterized by one parameter in the model. We study the variation of this parameter, describing the transition between the isotropic and the anisotropic model, analytically and numerically. We analyze the equilibria of the corresponding mean-field partial differential equation and investigate pattern formation numerically in two dimensions by studying the dependence of the resulting patterns on the parameters in the model.

DS Nov 20, 2017
An Anisotropic Interaction Model for Simulating Fingerprints

Bertram Düring, Carsten Gottschlich, Stephan Huckemann et al.

Evidence suggests that both the interaction of so-called Merkel cells and the epidermal stress distribution play an important role in the formation of fingerprint patterns during pregnancy. To model the formation of fingerprint patterns in a biologically meaningful way, these patterns have to become stationary. For the creation of synthetic fingerprints, it is also very desirable that rescaling the model parameters leads to rescaled distances between the stationary fingerprint ridges. Based on these observations, as well as the model introduced by Kücken and Champod, we propose a new model for the formation of fingerprint patterns during pregnancy. In this anisotropic interaction model, the interaction forces depend not only on the distance vector between the cells and the model parameters, but also on an underlying tensor field representing a stress field. This dependence on the tensor field leads to complex, anisotropic patterns. We study the resulting stationary patterns both analytically and numerically. In particular, we show that fingerprint patterns can be modeled as stationary solutions by choosing the underlying tensor field appropriately.

LG Nov 27, 2023
Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation

Teo Deveney, Jan Stanczuk, Lisa Maria Kreusser et al.

Score-based diffusion models have emerged as one of the most promising frameworks for deep generative modelling, due to their state-of-the-art performance in many generation tasks while relying on mathematical foundations such as stochastic differential equations (SDEs) and ordinary differential equations (ODEs). Empirically, it has been reported that ODE-based samples are inferior to SDE-based samples. In this paper we rigorously describe the range of dynamics and approximations that arise when training score-based diffusion models, including the true SDE dynamics, the neural approximations, the various approximate particle dynamics that result, as well as their associated Fokker-Planck equations and the neural network approximations of these Fokker-Planck equations. We systematically analyse the difference between the ODE and SDE dynamics of score-based diffusion models, and link it to an associated Fokker-Planck equation. We derive a theoretical upper bound on the Wasserstein 2-distance between the ODE- and SDE-induced distributions in terms of a Fokker-Planck residual. We also show numerically, using explicit comparisons, that conventional score-based diffusion models can exhibit significant differences between ODE- and SDE-induced distributions. Moreover, we show numerically that reducing the Fokker-Planck residual by adding it as an additional regularisation term leads to closing the gap between ODE- and SDE-induced distributions. Our experiments suggest that this regularisation can improve the distribution generated by the ODE, but that this can come at the cost of degraded SDE sample quality.

NA Aug 19, 2024
Parallel-in-Time Solutions with Random Projection Neural Networks

Marta M. Betcke, Lisa Maria Kreusser, Davide Murari

This paper considers one of the fundamental parallel-in-time methods for the solution of ordinary differential equations, Parareal, and extends it by adopting a neural network as a coarse propagator. We provide a theoretical analysis of the convergence properties of the proposed algorithm and show its effectiveness for several examples, including Lorenz and Burgers' equations. In our numerical simulations, we further specialize the underpinning neural architecture to Random Projection Neural Networks (RPNNs), two-layer neural networks in which the first-layer weights are drawn at random rather than optimized. This restriction substantially increases the efficiency of fitting the RPNN weights in comparison to a standard feedforward network without negatively impacting the accuracy, as demonstrated in the SIR system example.
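For orientation, the classical Parareal iteration that the abstract builds on can be sketched as follows. This is a minimal illustration and not the paper's method: the coarse propagator here is a single explicit Euler step standing in for the neural network, the fine propagator is many Euler substeps, and all function names (`fine`, `coarse`, `parareal`) are hypothetical.

```python
import math


def fine(u, t0, t1, f, substeps=100):
    """Fine propagator (assumption: many explicit Euler substeps)."""
    h = (t1 - t0) / substeps
    for k in range(substeps):
        u = u + h * f(t0 + k * h, u)
    return u


def coarse(u, t0, t1, f):
    """Coarse propagator (assumption: one Euler step, not a neural network)."""
    return u + (t1 - t0) * f(t0, u)


def parareal(f, u0, t_end, n_slices, n_iters):
    """Classical Parareal for the scalar ODE u' = f(t, u), u(0) = u0."""
    ts = [t_end * i / n_slices for i in range(n_slices + 1)]
    # Initial guess: one serial sweep with the cheap coarse propagator.
    U = [u0]
    for n in range(n_slices):
        U.append(coarse(U[n], ts[n], ts[n + 1], f))
    for _ in range(n_iters):
        # These two sweeps only use the previous iterate, so the expensive
        # fine solves could run in parallel across time slices.
        F_old = [fine(U[n], ts[n], ts[n + 1], f) for n in range(n_slices)]
        G_old = [coarse(U[n], ts[n], ts[n + 1], f) for n in range(n_slices)]
        # Serial correction: U_new[n+1] = G(U_new[n]) + F(U[n]) - G(U[n]).
        U_new = [u0]
        for n in range(n_slices):
            G_new = coarse(U_new[n], ts[n], ts[n + 1], f)
            U_new.append(G_new + F_old[n] - G_old[n])
        U = U_new
    return ts, U


if __name__ == "__main__":
    ts, U = parareal(lambda t, u: -u, 1.0, 1.0, n_slices=10, n_iters=3)
    print("error vs exp(-1):", abs(U[-1] - math.exp(-1.0)))
```

After as many iterations as there are time slices, the iterate coincides with the serial fine solution up to rounding, so the method only pays off when far fewer iterations suffice.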

LG Jul 2, 2024
Equidistribution-based training of Free Knot Splines and ReLU Neural Networks

Simone Appella, Simon Arridge, Chris Budd et al.

We consider the problem of univariate nonlinear function approximation using shallow neural networks (NN) with a rectified linear unit (ReLU) activation function. We show that the $L_2$ based approximation problem is ill-conditioned and the behaviour of optimisation algorithms used in training these networks degrades rapidly as the width of the network increases. This can lead to significantly poorer approximation in practice than expected from the theoretical expressivity of the ReLU architecture and traditional methods such as univariate Free Knot Splines (FKS). Univariate shallow ReLU NNs and FKS span the same function space, and thus have the same theoretical expressivity. However, the FKS representation remains well-conditioned as the number of knots increases. We leverage the theory of optimal piecewise linear interpolants to improve the training procedure for ReLU NNs. Using the equidistribution principle, we propose a two-level procedure for training the FKS by first solving the nonlinear problem of finding the optimal knot locations of the interpolating FKS, and then determining the optimal weights and knots of the FKS by solving a nearly linear, well-conditioned problem. The training of the FKS gives insights into how we can train a ReLU NN effectively, with an equally accurate approximation. We combine the training of the ReLU NN with an equidistribution-based loss to find the breakpoints of the ReLU functions. This is then combined with preconditioning the ReLU NN approximation to find the scalings of the ReLU functions. This fast, well-conditioned and reliable method finds an accurate shallow ReLU NN approximation to a univariate target function. We test this method on a series of regular, singular, and rapidly varying target functions and obtain good results, realising the expressivity of the shallow ReLU network in all cases. We then extend our results to deeper networks.
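The statement that univariate shallow ReLU networks and piecewise linear splines span the same function space can be illustrated with a small sketch. It rewrites a piecewise linear interpolant with given knots as a sum of shifted ReLU functions. This is only the standard textbook conversion, not the paper's training procedure, and the names `spline_to_relu` and `relu_net` are hypothetical.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def spline_to_relu(knots, values):
    """Convert a piecewise linear interpolant into ReLU form.

    Returns (bias, coeffs, breakpoints) such that on [knots[0], knots[-1]]
        s(x) = bias + sum_i coeffs[i] * relu(x - breakpoints[i]).
    Assumes the knots are strictly increasing.
    """
    slopes = [
        (values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
        for i in range(len(knots) - 1)
    ]
    # The first coefficient is the initial slope; each later coefficient is
    # the jump in slope at the corresponding interior knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, len(slopes))]
    return values[0], coeffs, knots[:-1]


def relu_net(x, bias, coeffs, breakpoints):
    """Evaluate the shallow ReLU representation at a point x."""
    return bias + sum(c * relu(x - b) for c, b in zip(coeffs, breakpoints))


if __name__ == "__main__":
    knots = [0.0, 0.5, 1.0, 2.0]
    values = [0.0, 1.0, 0.5, 2.0]
    bias, coeffs, breaks = spline_to_relu(knots, values)
    for k in knots:
        print(k, relu_net(k, bias, coeffs, breaks))
```

In this form the ReLU breakpoints play the role of the spline knots, which is the link the abstract uses to transfer knot-placement ideas to network training.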

NA Jan 19, 2022
Models for information propagation on graphs

Oliver R. A. Dunbar, Charles M. Elliott, Lisa Maria Kreusser

We propose and unify classes of different models for information propagation over graphs. In a first class, propagation is modelled as a wave which emanates from a set of known nodes at an initial time, to all other unknown nodes at later times with an ordering determined by the arrival time of the information wave front. A second class of models is based on the notion of a travel time along paths between nodes. The time of information propagation from an initial known set of nodes to a node is defined as the minimum of a generalised travel time over subsets of all admissible paths. A final class is given by imposing a local equation of an eikonal form at each unknown node, with boundary conditions at the known nodes. The solution value of the local equation at a node is coupled to those of neighbouring nodes with lower values. We provide precise formulations of the model classes and prove equivalences between them. Finally, we apply the front propagation models on graphs to semi-supervised learning via label propagation and information propagation on trust networks.
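The simplest instance of the travel-time view can be sketched as a shortest-path computation. The sketch assumes that the generalised travel time of a path is just the sum of positive edge weights, which is a special case chosen here for illustration and not the paper's general formulation; the function name `arrival_times` and the adjacency-list format are hypothetical.

```python
import heapq


def arrival_times(graph, known):
    """Arrival time of an information front started at the known nodes.

    graph: dict mapping node -> list of (neighbour, positive edge weight).
    known: iterable of nodes whose arrival time is fixed to zero.
    Assumption: path travel time = sum of edge weights (Dijkstra setting).
    """
    times = {node: float("inf") for node in graph}
    heap = []
    for s in known:
        times[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        t, u = heapq.heappop(heap)
        if t > times[u]:
            continue  # stale heap entry, node already reached earlier
        # A node's value is only ever updated from neighbours with lower
        # arrival time, mirroring the ordering by wave-front arrival.
        for v, w in graph[u]:
            if t + w < times[v]:
                times[v] = t + w
                heapq.heappush(heap, (t + w, v))
    return times


if __name__ == "__main__":
    g = {
        "a": [("b", 1.0), ("c", 4.0)],
        "b": [("a", 1.0), ("c", 2.0)],
        "c": [("a", 4.0), ("b", 2.0)],
    }
    print(arrival_times(g, ["a"]))
```

For label propagation, one could run such a front from each labelled class and assign every node the label whose front arrives first; that usage is likewise only a sketch of the idea.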

ML Mar 2, 2021
Wasserstein GANs Work Because They Fail (to Approximate the Wasserstein Distance)

Jan Stanczuk, Christian Etmann, Lisa Maria Kreusser et al.

Wasserstein GANs are based on the idea of minimising the Wasserstein distance between a real and a generated distribution. We provide an in-depth mathematical analysis of differences between the theoretical setup and the reality of training Wasserstein GANs. In this work, we gather both theoretical and empirical evidence that the WGAN loss is not a meaningful approximation of the Wasserstein distance. Moreover, we argue that the Wasserstein distance is not even a desirable loss function for deep generative models, and conclude that the success of Wasserstein GANs can in truth be attributed to a failure to approximate the Wasserstein distance.

AP Jul 24, 2020
On anisotropic diffusion equations for label propagation

Lisa Maria Kreusser, Marie-Therese Wolfram

In many problems in data classification one wishes to assign labels to points in a point cloud with a certain number of them being already correctly labeled. In this paper, we propose a microscopic ODE approach, in which information about correct labels is propagated to neighboring points. Its dynamics are based on alignment mechanisms, which are commonly used in large interacting agent systems in consensus formation. We derive the respective continuum description, which corresponds to an anisotropic diffusion equation with reaction term. Solutions of the continuum model on the bounded domain inherit certain properties of the underlying point cloud. We discuss these analytic properties and exemplify the results with micro- and macroscopic simulations.

LG Jan 21, 2019
A Deterministic Gradient-Based Approach to Avoid Saddle Points

Lisa Maria Kreusser, Stanley J. Osher, Bao Wang

Loss functions with a large number of saddle points are one of the major obstacles for training modern machine learning models efficiently. First-order methods such as gradient descent are usually the methods of choice for training machine learning models. However, these methods converge to saddle points for certain choices of initial guesses. In this paper, we propose a modification of the recently proposed Laplacian smoothing gradient descent [Osher et al., arXiv:1806.06317], called modified Laplacian smoothing gradient descent (mLSGD), and demonstrate its potential to avoid saddle points without sacrificing the convergence rate. Our analysis is based on the attraction region, formed by all starting points for which the considered numerical scheme converges to a saddle point. We investigate the attraction region's dimension both analytically and numerically. For a canonical class of quadratic functions, we show that the dimension of the attraction region for mLSGD is $\lfloor (n-1)/2 \rfloor$, and hence it is significantly smaller than that of gradient descent, which is $n-1$.
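The notion of an attraction region can be illustrated for plain gradient descent on a two-dimensional quadratic with a saddle at the origin. The function f(x, y) = x^2 - y^2, the step size, and the helper name `gradient_descent` are assumptions made for this sketch; it does not implement mLSGD. In this example the attraction region of the saddle is the x-axis, which has dimension n - 1 = 1.

```python
def gradient_descent(x, y, step=0.1, n_steps=200):
    """Plain gradient descent on f(x, y) = x**2 - y**2 (saddle at the origin)."""
    for _ in range(n_steps):
        gx, gy = 2.0 * x, -2.0 * y  # gradient of f
        x, y = x - step * gx, y - step * gy
    return x, y


if __name__ == "__main__":
    # Start exactly on the x-axis: the iterates converge to the saddle (0, 0).
    print(gradient_descent(1.0, 0.0))
    # A tiny perturbation off the axis is amplified and the iterates escape.
    print(gradient_descent(1.0, 1e-8))
```

Each step multiplies x by 0.8 and y by 1.2, so only starting points with y exactly zero are drawn into the saddle.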