NA Nov 14, 2012
Geometric properties of Kahan's method
Elena Celledoni, Robert I McLachlan, Brynjulf Owren et al.
We show that Kahan's discretization of quadratic vector fields is equivalent to a Runge-Kutta method. When the vector field is Hamiltonian on either a symplectic vector space or a Poisson vector space with constant Poisson structure, the map determined by this discretization has a conserved modified Hamiltonian and an invariant measure, a combination previously unknown amongst Runge-Kutta methods applied to nonlinear vector fields. This produces large classes of integrable rational mappings in two and three dimensions, explaining some of the integrable cases that were previously known.
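The equivalence stated in this abstract can be checked numerically. The sketch below is an illustration, not code from the paper: it assumes the standard linearly implicit form of Kahan's method, $x' = x + h\,(I - \tfrac{h}{2} f'(x))^{-1} f(x)$, and compares it with the Runge-Kutta form $(x' - x)/h = -\tfrac12 f(x) + 2 f(\tfrac{x + x'}{2}) - \tfrac12 f(x')$. The Lotka-Volterra field and all function names are choices made here for the example.

```python
def lotka_volterra(x, y):
    """Quadratic test field f(x, y) = (x(1 - y), y(x - 1)); chosen for illustration."""
    return x * (1.0 - y), y * (x - 1.0)


def kahan_step(x, y, h):
    """One Kahan step x' = x + h (I - h/2 f'(x))^{-1} f(x) for the field above."""
    fx, fy = lotka_volterra(x, y)
    # A = I - (h/2) J, where J = [[1 - y, -x], [y, x - 1]] is the Jacobian of f.
    a11 = 1.0 - 0.5 * h * (1.0 - y)
    a12 = 0.5 * h * x
    a21 = -0.5 * h * y
    a22 = 1.0 - 0.5 * h * (x - 1.0)
    det = a11 * a22 - a12 * a21
    # Solve the 2x2 system A d = h f with Cramer's rule.
    dx = h * (fx * a22 - a12 * fy) / det
    dy = h * (a11 * fy - a21 * fx) / det
    return x + dx, y + dy


def runge_kutta_residual(x, y, h):
    """Residual of the Runge-Kutta form evaluated at the Kahan update."""
    x1, y1 = kahan_step(x, y, h)
    f0 = lotka_volterra(x, y)
    fm = lotka_volterra(0.5 * (x + x1), 0.5 * (y + y1))
    f1 = lotka_volterra(x1, y1)
    rx = (x1 - x) / h - (-0.5 * f0[0] + 2.0 * fm[0] - 0.5 * f1[0])
    ry = (y1 - y) / h - (-0.5 * f0[1] + 2.0 * fm[1] - 0.5 * f1[1])
    return rx, ry
```

For a quadratic field the residual should vanish up to rounding error; for a non-quadratic field the two forms generally differ.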
NA Nov 27, 2012
An introduction to Lie group integrators -- basics, new developments and applications
Elena Celledoni, Håkon Marthinsen, Brynjulf Owren
We give a short and elementary introduction to Lie group methods. A selection of applications of Lie group integrators is discussed. Finally, a family of symplectic integrators on cotangent bundles of Lie groups is presented and the notion of discrete gradient methods is generalised to Lie groups.
NA Jun 8, 2011
Preserving multiple first integrals by discrete gradients
Morten Dahlby, Brynjulf Owren, Takaharu Yaguchi
We consider systems of ordinary differential equations with known first integrals. The notion of a discrete tangent space is introduced as the orthogonal complement of an arbitrary set of discrete gradients. Integrators which exactly conserve all the first integrals simultaneously are then defined. In both cases we start from an arbitrary method of a prescribed order (say, a Runge-Kutta scheme) and modify it using two approaches: one based on projection and one based on local coordinates. The methods are tested on the Kepler problem.
LG Oct 5, 2022
Dynamical systems' based neural networks
Elena Celledoni, Davide Murari, Brynjulf Owren et al.
Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.
NA Mar 15, 2012
The minimal stage, energy preserving Runge-Kutta method for polynomial Hamiltonian systems is the Averaged Vector Field method
Elena Celledoni, Brynjulf Owren, Yajuan Sun
No Runge-Kutta method can be energy preserving for all Hamiltonian systems. But for problems in which the Hamiltonian is a polynomial, the Averaged Vector Field (AVF) method can be interpreted as a Runge-Kutta method whose weights $b_i$ and abscissae $c_i$ represent a quadrature rule of degree at least that of the Hamiltonian. We prove that when the number of stages is minimal, the Runge-Kutta scheme must in fact be identical to the AVF scheme.
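A small numerical sketch may help illustrate the claim. It assumes the usual definition of the AVF method, $x' = x + h \int_0^1 f((1-\xi)x + \xi x')\,d\xi$, and a quartic test Hamiltonian $H(q,p) = p^2/2 + q^4/4$ chosen here for illustration. The integrand is then cubic in $\xi$, so Simpson's rule evaluates the integral exactly and the AVF step becomes a three-stage implicit Runge-Kutta step. The fixed-point solver and all names are assumptions of this example, not taken from the paper.

```python
def hamiltonian(q, p):
    """Quartic test Hamiltonian H = p^2/2 + q^4/4 (chosen for illustration)."""
    return 0.5 * p * p + 0.25 * q ** 4


def vector_field(q, p):
    """Hamiltonian vector field (dH/dp, -dH/dq) = (p, -q^3)."""
    return p, -q ** 3


def avf_step(q, p, h, iterations=50):
    """One AVF step; the integral over [0, 1] is evaluated by Simpson's rule,
    which is exact here because the integrand is a cubic polynomial in xi."""
    q1, p1 = q, p  # initial guess for the implicit update
    f0 = vector_field(q, p)
    for _ in range(iterations):
        # Simple fixed-point iteration; adequate for small step sizes h.
        fm = vector_field(0.5 * (q + q1), 0.5 * (p + p1))
        f1 = vector_field(q1, p1)
        q1_new = q + h * (f0[0] + 4.0 * fm[0] + f1[0]) / 6.0
        p1_new = p + h * (f0[1] + 4.0 * fm[1] + f1[1]) / 6.0
        q1, p1 = q1_new, p1_new
    return q1, p1


def energy_drift(q, p, h, steps):
    """Absolute change in H after a number of AVF steps."""
    h0 = hamiltonian(q, p)
    for _ in range(steps):
        q, p = avf_step(q, p, h)
    return abs(hamiltonian(q, p) - h0)
```

With a step size small enough for the fixed-point iteration to converge, the energy drift should stay at the level of rounding error.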
LG Jun 29, 2023
Designing Stable Neural Networks using Convex Analysis and ODEs
Ferdia Sherry, Elena Celledoni, Matthias J. Ehrhardt et al.
Motivated by classical work on the numerical integration of ordinary differential equations we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.
NA Feb 19, 2013
Preserving first integrals with symmetric Lie group methods
Elena Celledoni, Brynjulf Owren
The discrete gradient approach is generalized to yield integral preserving methods for differential equations in Lie groups.
LG Oct 27, 2025
Mixed Precision Training of Neural ODEs
Elena Celledoni, Brynjulf Owren, Lars Ruthotto et al.
Exploiting low-precision computations has become a standard strategy in deep learning to address the growing computational costs imposed by ever larger models and datasets. However, naively performing all computations in low precision can lead to roundoff errors and instabilities. Therefore, mixed precision training schemes usually store the weights in high precision and use low-precision computations only for whitelisted operations. Despite their success, these principles are currently not reliable for training continuous-time architectures such as neural ordinary differential equations (Neural ODEs). This paper presents a mixed precision training framework for neural ODEs, combining explicit ODE solvers with a custom backpropagation scheme, and demonstrates its effectiveness across a range of learning tasks. Our scheme uses low-precision computations for evaluating the velocity, parameterized by the neural network, and for storing intermediate states, while stability is provided by a custom dynamic adjoint scaling and by accumulating the solution and gradients in higher precision. These contributions address two key challenges in training neural ODEs: the computational cost of repeated network evaluations and the growth of memory requirements with the number of time steps or layers. Along with the paper, we publish our extendable, open-source PyTorch package rampde, whose syntax resembles that of leading packages to provide a drop-in replacement in existing codes. We demonstrate the reliability and effectiveness of our scheme using challenging test cases and on neural ODE applications in image classification and generative models, achieving approximately 50% memory reduction and up to 2x speedup while maintaining accuracy comparable to single-precision training.
NA Mar 19, 2025
Approximation properties of neural ODEs
Arturo De Marinis, Davide Murari, Elena Celledoni et al.
We study the approximation properties of shallow neural networks whose activation function is defined as the flow map of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters satisfy specific constraints. In particular, we constrain the Lipschitz constant of the neural ODE's flow map and the norms of the weights to increase the network's stability. We prove that the UAP holds if we consider either constraint independently. When both are enforced, there is a loss of expressiveness, and we derive approximation bounds that quantify how accurately such a constrained network can approximate a continuous function.
NA May 1, 2023
Predictions Based on Pixel Data: Insights from PDEs and Finite Differences
Elena Celledoni, James Jackaman, Davide Murari et al.
As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is (residual) convolutional networks. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
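The connection between discrete convolution and finite difference operators mentioned in this abstract can be sketched in one space dimension. The example below is a simplified illustration and not the paper's architecture: it applies the three-point stencil $[1, -2, 1]/\Delta x^2$ as a convolution kernel to approximate $u_{xx}$, then uses it in one explicit Euler step of a method-of-lines discretization of the heat equation. The function names, the 1D setting, and the fixed boundary values are assumptions made here.

```python
def second_difference_kernel(dx):
    """Three-point kernel approximating the second derivative on a uniform grid."""
    return [1.0 / dx ** 2, -2.0 / dx ** 2, 1.0 / dx ** 2]


def convolve_interior(u, kernel):
    """Apply a 3-point kernel at interior grid points only.
    The kernel is symmetric, so convolution and cross-correlation coincide."""
    return [
        kernel[0] * u[i - 1] + kernel[1] * u[i] + kernel[2] * u[i + 1]
        for i in range(1, len(u) - 1)
    ]


def heat_step(u, dx, dt):
    """One explicit Euler step of u_t = u_xx with boundary values held fixed."""
    laplacian = convolve_interior(u, second_difference_kernel(dx))
    interior = [u[i + 1] + dt * laplacian[i] for i in range(len(laplacian))]
    return [u[0]] + interior + [u[-1]]
```

Applied to samples of $u(x) = x^2$, the kernel should return values close to 2 at every interior point, since the second difference is exact for quadratics.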
NA Jan 31, 2022
Learning Hamiltonians of constrained mechanical systems
Elena Celledoni, Andrea Leone, Davide Murari et al.
Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.
LG Feb 23, 2021
Equivariant neural networks for inverse problems
Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann et al.
In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of $\mathbf R^d$, but other examples are the scaling symmetry of $\mathbf R^d$ and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with a little extra computational cost at training time but without any extra cost at test time.
LG Jun 5, 2020
Structure preserving deep learning
Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann et al.
Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.
OC Apr 11, 2019
Deep learning as optimal control problems: models and numerical methods
Martin Benning, Elena Celledoni, Matthias J. Ehrhardt et al.
We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
NA May 4, 2011
A general framework for deriving integral preserving numerical methods for PDEs
Morten Dahlby, Brynjulf Owren
A general procedure for constructing conservative numerical integrators for time dependent partial differential equations is presented. In particular, linearly implicit methods preserving a time discretised version of the invariant are developed for systems of partial differential equations with polynomial nonlinearities. The framework is rather general and allows for an arbitrary number of dependent and independent variables with derivatives of any order. It is proved formally that second order convergence is obtained. The procedure is applied to a test case and numerical experiments are provided.
NA Nov 4, 2010
Plane wave stability of some conservative schemes for the cubic Schrödinger equation
Morten Dahlby, Brynjulf Owren
The plane wave stability properties of the conservative schemes of Besse and Fei et al. for the cubic Schrödinger equation are analysed. Although the two methods possess many of the same conservation properties, we show that their stability behaviour is very different. An energy preserving generalisation of the Fei method with improved stability is presented.