Teo Deveney

LG
6 papers
66 citations
Novelty: 64%
AI Score: 31

6 Papers

LG Jul 5, 2024
G-Adaptivity: optimised graph-based mesh relocation for finite element methods

James Rowbottom, Georg Maierhofer, Teo Deveney et al.

We present a novel and effective approach to achieving optimal mesh relocation in finite element methods (FEMs). The cost and accuracy of FEMs are critically dependent on the choice of mesh points. Mesh relocation (r-adaptivity) seeks to optimise the mesh geometry to obtain the best solution accuracy at a given computational budget. Classical r-adaptivity relies on the solution of a separate nonlinear "meshing" PDE to determine mesh point locations. This incurs significant cost at remeshing, and relies on estimates that relate interpolation error and FEM error. Recent machine learning approaches have focused on the construction of fast surrogates for such classical methods. Instead, our new approach trains a graph neural network (GNN) to determine mesh point locations by directly minimising the FE solution error from the PDE system Firedrake to achieve higher solution accuracy. Our GNN architecture closely aligns the mesh solution space to that of classical meshing methodologies, thus replacing classical estimates for optimality with a learnable strategy. This allows for rapid and robust training and results in an extremely efficient and effective GNN approach to online r-adaptivity. Our method outperforms both classical and prior ML approaches to r-adaptive meshing. In particular, it achieves lower FE solution error, whilst retaining the significant speed-up over classical methods observed in prior ML work.

LG Dec 23, 2022
Your diffusion model secretly knows the dimension of the data manifold

Jan Stanczuk, Georgios Batzolis, Teo Deveney et al.

In this work, we propose a novel framework for estimating the dimension of the data manifold using a trained diffusion model. A diffusion model approximates the score function, i.e. the gradient of the log density of a noise-corrupted version of the target distribution, for varying levels of corruption. We prove that, if the data concentrates around a manifold embedded in the high-dimensional ambient space, then as the level of corruption decreases, the score function points towards the manifold, as this direction becomes the direction of maximal likelihood increase. Therefore, for small levels of corruption, the diffusion model provides us with access to an approximation of the normal bundle of the data manifold. This allows us to estimate the dimension of the tangent space and thus the intrinsic dimension of the data manifold. To the best of our knowledge, our method is the first estimator of the data manifold dimension based on diffusion models, and it outperforms well-established statistical estimators in controlled experiments on both Euclidean and image data.
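The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trained diffusion model is replaced by a hypothetical `toy_score` function, the exact score of standard Gaussian data on a 3-dimensional linear subspace of a 10-dimensional space after Gaussian corruption at level `sigma`. The rule used to read the normal-space dimension from the singular values (largest drop between consecutive values) is also an assumption of this sketch.

```python
import numpy as np

def toy_score(x, sigma, k):
    """Exact score of N(0, I_k) data on the first k axes, corrupted by N(0, sigma^2 I).

    Stand-in (assumption) for a trained diffusion model's score network.
    """
    var = np.full(x.shape[-1], sigma**2)
    var[:k] += 1.0  # tangential directions also carry the data variance
    return -x / var

def estimate_dimension(score_fn, x0, sigma, n_samples, rng):
    """Estimate the intrinsic dimension at x0 from the spectrum of nearby score vectors."""
    d = x0.shape[0]
    # Perturb x0 with small noise and evaluate the score at each perturbed point.
    xs = x0 + sigma * rng.standard_normal((n_samples, d))
    scores = score_fn(xs)
    # For small noise the scores concentrate in the normal space, so its
    # dimension appears as the number of dominant singular values.
    s = np.linalg.svd(scores, compute_uv=False)
    normal_dim = int(np.argmax(s[:-1] - s[1:])) + 1  # position of the largest drop
    return d - normal_dim

rng = np.random.default_rng(0)
d, k, sigma = 10, 3, 0.01
x0 = np.zeros(d)
x0[:k] = rng.standard_normal(k)  # a point on the k-dimensional subspace
dim = estimate_dimension(lambda x: toy_score(x, sigma, k), x0, sigma, 1000, rng)
print(dim)
```

In this toy setting the normal components of the score scale like 1/sigma while the tangential components stay of order one, which is what separates the two groups of singular values.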

LG Nov 27, 2023
Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation

Teo Deveney, Jan Stanczuk, Lisa Maria Kreusser et al.

Score-based diffusion models have emerged as one of the most promising frameworks for deep generative modelling, due to their state-of-the-art performance in many generation tasks while relying on mathematical foundations such as stochastic differential equations (SDEs) and ordinary differential equations (ODEs). Empirically, it has been reported that ODE-based samples are inferior to SDE-based samples. In this paper we rigorously describe the range of dynamics and approximations that arise when training score-based diffusion models, including the true SDE dynamics, the neural approximations, the various approximate particle dynamics that result, as well as their associated Fokker-Planck equations and the neural network approximations of these Fokker-Planck equations. We systematically analyse the difference between the ODE and SDE dynamics of score-based diffusion models, and link it to an associated Fokker-Planck equation. We derive a theoretical upper bound on the Wasserstein 2-distance between the ODE- and SDE-induced distributions in terms of a Fokker-Planck residual. We also show numerically, using explicit comparisons, that conventional score-based diffusion models can exhibit significant differences between ODE- and SDE-induced distributions. Moreover, we show numerically that reducing the Fokker-Planck residual by adding it as an additional regularisation term helps close the gap between ODE- and SDE-induced distributions. Our experiments suggest that this regularisation can improve the distribution generated by the ODE, but that this can come at the cost of degraded SDE sample quality.
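For orientation, the standard objects this abstract refers to can be written as follows. The notation (drift $f$, diffusion coefficient $g$, marginal density $p_t$, Wiener processes $W_t$ and $\bar W_t$) is the usual one for score-based models and is an assumption here, since the abstract itself fixes no symbols.

$$
\begin{aligned}
\text{forward SDE:}\quad & dx = f(x,t)\,dt + g(t)\,dW_t, \\
\text{Fokker-Planck equation:}\quad & \partial_t p_t = -\nabla\cdot\big(f\,p_t\big) + \tfrac{1}{2}\,g(t)^2\,\Delta p_t, \\
\text{reverse-time SDE:}\quad & dx = \big[f(x,t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,dt + g(t)\,d\bar W_t, \\
\text{probability flow ODE:}\quad & dx = \big[f(x,t) - \tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(x)\big]\,dt.
\end{aligned}
$$

With the exact score $\nabla_x \log p_t$, the reverse-time SDE and the probability flow ODE share the same marginals $p_t$. Once the score is replaced by a learned approximation, that equivalence is no longer guaranteed, which is the gap the abstract analyses.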

NA Apr 5, 2022
Deep surrogate accelerated delayed-acceptance HMC for Bayesian inference of spatio-temporal heat fluxes in rotating disc systems

Teo Deveney, Eike Mueller, Tony Shardlow

We introduce a deep learning accelerated methodology to solve PDE-based Bayesian inverse problems with guaranteed accuracy. This is motivated by the ill-posed problem of inferring a spatio-temporal heat-flux parameter known as the Biot number given temperature data; however, the methodology is generalisable to other settings. To accelerate Bayesian inference, we develop a novel training scheme that uses data to adaptively train a neural-network surrogate simulating the parametric forward model. By simultaneously identifying an approximate posterior distribution over the Biot number, and weighting a physics-informed training loss according to this, our approach approximates the forward and inverse solutions together without any need for external solves. Using a random Chebyshev series, we outline how to approximate a Gaussian process prior, and using the surrogate we apply Hamiltonian Monte Carlo (HMC) to sample from the posterior distribution. We derive convergence of the surrogate posterior to the true posterior distribution in the Hellinger metric as our adaptive loss approaches zero. Additionally, we describe how this surrogate-accelerated HMC approach can be combined with traditional PDE solvers in a delayed-acceptance scheme to control the posterior accuracy a priori. This overcomes a major limitation of deep learning-based surrogate approaches, which do not achieve guaranteed accuracy a priori due to their non-convex training. Biot number calculations are involved in turbo-machinery design, which is safety critical and highly regulated; it is therefore important that our results have such mathematical guarantees. Our approach achieves fast mixing in high dimensions whilst retaining the convergence guarantees of a traditional PDE solver, and without the burden of evaluating this solver for proposals that are likely to be rejected. Numerical results are given using real and simulated data.
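The delayed-acceptance idea mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the paper's method: it uses a symmetric random-walk proposal in place of HMC, and the one-dimensional densities `log_exact` (standing in for the expensive PDE-based posterior) and `log_surrogate` (standing in for the neural surrogate) are hypothetical.

```python
import numpy as np

def delayed_acceptance_mh(log_exact, log_surrogate, x0, n_steps, step, rng):
    """Two-stage Metropolis-Hastings with a symmetric random-walk proposal."""
    x, lx_exact, lx_sur = x0, log_exact(x0), log_surrogate(x0)
    samples, exact_calls = np.empty(n_steps), 1
    for i in range(n_steps):
        y = x + step * rng.standard_normal()
        ly_sur = log_surrogate(y)
        # Stage 1: screen the proposal using only the cheap surrogate.
        if np.log(rng.uniform()) < ly_sur - lx_sur:
            # Stage 2: correct with the exact density, so the chain still
            # targets the exact posterior rather than the surrogate one.
            ly_exact = log_exact(y)
            exact_calls += 1
            if np.log(rng.uniform()) < (ly_exact - lx_exact) - (ly_sur - lx_sur):
                x, lx_exact, lx_sur = y, ly_exact, ly_sur
        samples[i] = x
    return samples, exact_calls

# Hypothetical targets: exact posterior N(1, 1), slightly biased surrogate N(0.8, 1.2^2).
log_exact = lambda x: -0.5 * (x - 1.0) ** 2
log_surrogate = lambda x: -0.5 * ((x - 0.8) / 1.2) ** 2

rng = np.random.default_rng(0)
n_steps = 20000
samples, exact_calls = delayed_acceptance_mh(log_exact, log_surrogate, 0.0, n_steps, 2.0, rng)
print(samples.mean(), samples.std(), exact_calls)
```

Proposals rejected in the first stage never trigger an exact evaluation, so the expensive model is called fewer times than there are proposals, while the second-stage correction keeps the exact density as the stationary distribution.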

LG Jul 2, 2024
Equidistribution-based training of Free Knot Splines and ReLU Neural Networks

Simone Appella, Simon Arridge, Chris Budd et al.

We consider the problem of univariate nonlinear function approximation using shallow neural networks (NN) with a rectified linear unit (ReLU) activation function. We show that the $L_2$-based approximation problem is ill-conditioned and that the behaviour of optimisation algorithms used in training these networks degrades rapidly as the width of the network increases. This can lead to significantly poorer approximation in practice than expected from the theoretical expressivity of the ReLU architecture and traditional methods such as univariate Free Knot Splines (FKS). Univariate shallow ReLU NNs and FKS span the same function space, and thus have the same theoretical expressivity. However, the FKS representation remains well-conditioned as the number of knots increases. We leverage the theory of optimal piecewise linear interpolants to improve the training procedure for ReLU NNs. Using the equidistribution principle, we propose a two-level procedure for training the FKS by first solving the nonlinear problem of finding the optimal knot locations of the interpolating FKS, and then determining the optimal weights and knots of the FKS by solving a nearly linear, well-conditioned problem. The training of the FKS gives insights into how we can train a ReLU NN effectively, with an equally accurate approximation. We combine the training of the ReLU NN with an equidistribution-based loss to find the breakpoints of the ReLU functions. This is then combined with preconditioning the ReLU NN approximation to find the scalings of the ReLU functions. This fast, well-conditioned and reliable method finds an accurate shallow ReLU NN approximation to a univariate target function. We test this method on a series of regular, singular, and rapidly varying target functions and obtain good results, realising the expressivity of the shallow ReLU network in all cases. We then extend our results to deeper networks.
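As a rough illustration of the equidistribution principle named in this abstract, the sketch below places the knots of a piecewise linear interpolant so that each interval carries an equal share of a monitor function. It covers only the knot-placement idea, not the paper's two-level training procedure. The monitor $|f''|^{2/5}$ (the classical choice for $L_2$ error of linear interpolants), the small regulariser `eps`, the finite-difference second derivative and the test function are all assumptions of this sketch.

```python
import numpy as np

def equidistributed_knots(f, n_knots, n_fine=4001, eps=1e-3):
    """Place knots on [0, 1] so each interval holds equal monitor-function mass."""
    xs = np.linspace(0.0, 1.0, n_fine)
    # Finite-difference estimate of f'' on a fine reference grid.
    d2 = np.gradient(np.gradient(f(xs), xs), xs)
    # Assumed monitor: |f''|^(2/5), regularised so it stays strictly positive.
    monitor = np.abs(d2) ** 0.4 + eps
    # Cumulative integral of the monitor (trapezoidal rule), normalised to [0, 1].
    cum = np.concatenate([[0.0], np.cumsum(0.5 * (monitor[1:] + monitor[:-1]) * np.diff(xs))])
    cum /= cum[-1]
    # Invert the cumulative integral at equally spaced levels.
    return np.interp(np.linspace(0.0, 1.0, n_knots), cum, xs)

def l2_interp_error(f, knots, n_fine=4001):
    """Discrete L2 error of the piecewise linear interpolant through the knots."""
    xs = np.linspace(0.0, 1.0, n_fine)
    err = np.interp(xs, knots, f(knots)) - f(xs)
    return np.sqrt(np.mean(err**2))

f = lambda x: np.tanh(20.0 * (x - 0.5))  # rapidly varying target with an interior layer
uniform = np.linspace(0.0, 1.0, 17)
adapted = equidistributed_knots(f, 17)
err_uniform = l2_interp_error(f, uniform)
err_adapted = l2_interp_error(f, adapted)
print(err_uniform, err_adapted)
```

The adapted knots cluster in the interior layer, where the curvature is large, and thin out where the function is nearly flat.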

NA Oct 3, 2019
A deep surrogate approach to efficient Bayesian inversion in PDE and integral equation models

Teo Deveney, Eike Mueller, Tony Shardlow

We investigate a deep learning approach to efficiently perform Bayesian inference in partial differential equation (PDE) and integral equation models over potentially high-dimensional parameter spaces. The contributions of this paper are two-fold: the first is the introduction of a neural network approach to approximating the solutions of Fredholm and Volterra integral equations of the first and second kind. The second is the development of a new, efficient deep learning-based method for Bayesian inversion applied to problems that can be described by PDEs or integral equations. To achieve this we introduce a surrogate model, and demonstrate how this allows efficient sampling from a Bayesian posterior distribution in which the likelihood depends on the solutions of PDEs or integral equations. Our method relies on the direct approximation of parametric solutions by neural networks, without the need for traditional numerical solves. This deep learning approach allows the accurate and efficient approximation of parametric solutions in significantly higher dimensions than is possible using classical discretisation schemes. Since the approximated solutions can be cheaply evaluated, Bayesian inverse problems over large parameter spaces can be solved efficiently using Markov chain Monte Carlo. We demonstrate the performance of our method using two real-world examples: Bayesian inference in the PDE and integral equation case for an example from electrochemistry, and Bayesian inference of a function-valued heat-transfer parameter with applications in aviation.