Elena Celledoni

LG
h-index: 29
17 papers
258 citations
Novelty: 40%
AI Score: 36

17 Papers

NA · Nov 14, 2012
Geometric properties of Kahan's method

Elena Celledoni, Robert I McLachlan, Brynjulf Owren et al.

We show that Kahan's discretization of quadratic vector fields is equivalent to a Runge--Kutta method. When the vector field is Hamiltonian on either a symplectic vector space or a Poisson vector space with constant Poisson structure, the map determined by this discretization has a conserved modified Hamiltonian and an invariant measure, a combination previously unknown amongst Runge--Kutta methods applied to nonlinear vector fields. This produces large classes of integrable rational mappings in two and three dimensions, explaining some of the integrable cases that were previously known.
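As a small numerical illustration of the equivalence stated in the abstract, the sketch below applies one Kahan step to a quadratic vector field and checks that the result satisfies the Runge-Kutta relation (x' - x)/h = -f(x)/2 + 2 f((x + x')/2) - f(x')/2. The Lotka-Volterra field, the step size and the function names are assumptions chosen for the example; the step uses the linearly implicit form x' = x + h (I - (h/2) f'(x))^{-1} f(x), which holds for quadratic f.

```python
import numpy as np

def f(u):
    """Quadratic (Lotka-Volterra) vector field, chosen only for illustration."""
    x, y = u
    return np.array([x * (1.0 - y), y * (x - 1.0)])

def jac(u):
    """Jacobian f'(u) of the vector field above."""
    x, y = u
    return np.array([[1.0 - y, -x],
                     [y, x - 1.0]])

def kahan_step(u, h):
    """One Kahan step in its linearly implicit form (valid for quadratic f)."""
    A = np.eye(2) - 0.5 * h * jac(u)
    return u + h * np.linalg.solve(A, f(u))

def rk_residual(u, u_new, h):
    """Residual of the Runge-Kutta form of Kahan's method."""
    mid = 0.5 * (u + u_new)
    return (u_new - u) / h - (-0.5 * f(u) + 2.0 * f(mid) - 0.5 * f(u_new))

u0 = np.array([1.5, 0.8])
h = 0.1
u1 = kahan_step(u0, h)
residual = rk_residual(u0, u1, h)  # vanishes up to rounding error
```

The residual is zero only because f is quadratic; for a general vector field the two formulas define different maps.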

NA · Nov 27, 2012
An introduction to Lie group integrators -- basics, new developments and applications

Elena Celledoni, Håkon Marthinsen, Brynjulf Owren

We give a short and elementary introduction to Lie group methods. A selection of applications of Lie group integrators is discussed. Finally, a family of symplectic integrators on cotangent bundles of Lie groups is presented and the notion of discrete gradient methods is generalised to Lie groups.
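A minimal sketch of the simplest Lie group integrator, the Lie-Euler method on SO(3), may help fix ideas. It advances R' = hat(omega(R)) R by R_{n+1} = exp(h hat(omega(R_n))) R_n, so every iterate stays a rotation matrix. The angular velocity function `omega` is hypothetical and chosen only for illustration; the exponential is evaluated with Rodrigues' formula.

```python
import numpy as np

def hat(v):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def expm_so3(v):
    """Matrix exponential of hat(v) via Rodrigues' formula."""
    theta = np.linalg.norm(v)
    K = hat(v)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order fallback for tiny rotations
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def omega(R):
    """Hypothetical state-dependent angular velocity (illustration only)."""
    return np.array([1.0, 0.5, 0.0]) + 0.3 * R[:, 0]

def lie_euler(R0, h, n_steps):
    """Lie-Euler: compose exponentials so the iterates remain in SO(3)."""
    R = R0.copy()
    for _ in range(n_steps):
        R = expm_so3(h * omega(R)) @ R
    return R

R_final = lie_euler(np.eye(3), h=0.05, n_steps=200)
orthogonality_error = np.linalg.norm(R_final.T @ R_final - np.eye(3))
determinant = np.linalg.det(R_final)
```

A classical Euler step R + h hat(omega) R would leave the group after one step, which is the motivation for updating through the exponential map.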

LG · Oct 5, 2022
Dynamical systems' based neural networks

Elena Celledoni, Davide Murari, Brynjulf Owren et al.

Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.

NA · Mar 15, 2012
The minimal stage, energy preserving Runge-Kutta method for polynomial Hamiltonian systems is the Averaged Vector Field method

Elena Celledoni, Brynjulf Owren, Yajuan Sun

No Runge-Kutta method can be energy preserving for all Hamiltonian systems. But for problems in which the Hamiltonian is a polynomial, the Averaged Vector Field (AVF) method can be interpreted as a Runge-Kutta method whose weights $b_i$ and abscissae $c_i$ represent a quadrature rule of degree at least that of the Hamiltonian. We prove that when the number of stages is minimal, the Runge-Kutta scheme must in fact be identical to the AVF scheme.
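The following sketch illustrates the AVF method, x' = x + h J times the integral over [0, 1] of grad H((1 - s) x + s x') ds, with the integral replaced by a quadrature rule. The quartic Hamiltonian H(q, p) = p^2/2 + q^4/4, the step size and the fixed-point solver are assumptions made for the example. Since grad H is cubic along the segment, the 2-point Gauss-Legendre rule integrates it exactly, and the energy is preserved up to rounding and solver tolerance.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def H(z):
    """Quartic polynomial Hamiltonian, chosen only for illustration."""
    q, p = z
    return 0.5 * p**2 + 0.25 * q**4

def grad_H(z):
    q, p = z
    return np.array([q**3, p])

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])  # canonical symplectic matrix

# 2-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1]
nodes, weights = leggauss(2)
xi = 0.5 * (nodes + 1.0)
w = 0.5 * weights

def avf_step(z, h, n_iter=100):
    """One AVF step, solved by plain fixed-point iteration."""
    z_new = z.copy()
    for _ in range(n_iter):
        avg = sum(wi * grad_H((1.0 - s) * z + s * z_new)
                  for wi, s in zip(w, xi))
        z_new = z + h * (J @ avg)
    return z_new

z = np.array([1.0, 0.0])
H0 = H(z)
for _ in range(100):
    z = avf_step(z, h=0.1)
energy_drift = abs(H(z) - H0)
```

Fixed-point iteration is used here only for brevity; a Newton solver would be the more common choice for stiff problems or larger steps.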

NA · Mar 16, 2015
High order semi-Lagrangian methods for the incompressible Navier-Stokes equations

Elena Celledoni, Bawfeh Kingsley Kometa, Olivier Verdier

We propose a class of semi-Lagrangian methods of high approximation order in space and time, based on spectral element space discretizations and exponential integrators of Runge-Kutta type. We discuss the extension of these methods to the Navier-Stokes equations, and their implementation using projections. Semi-Lagrangian methods up to order three are implemented and tested on various examples. The good performance of the methods for convection-dominated problems is demonstrated with numerical experiments.

LG · Jun 29, 2023
Designing Stable Neural Networks using Convex Analysis and ODEs

Ferdia Sherry, Elena Celledoni, Matthias J. Ehrhardt et al.

Motivated by classical work on the numerical integration of ordinary differential equations we present a ResNet-styled neural network architecture that encodes non-expansive (1-Lipschitz) operators, as long as the spectral norms of the weights are appropriately constrained. This is to be contrasted with the ordinary ResNet architecture which, even if the spectral norms of the weights are constrained, has a Lipschitz constant that, in the worst case, grows exponentially with the depth of the network. Further analysis of the proposed architecture shows that the spectral norms of the weights can be further constrained to ensure that the network is an averaged operator, making it a natural candidate for a learned denoiser in Plug-and-Play algorithms. Using a novel adaptive way of enforcing the spectral norm constraints, we show that, even with these constraints, it is possible to train performant networks. The proposed architecture is applied to the problem of adversarially robust image classification, to image denoising, and finally to the inverse problem of deblurring.
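One way to see how a residual layer can be non-expansive is the sketch below. It assumes a layer of the form x - tau * W^T relu(W x + b), which is a gradient descent step on a convex function and is therefore 1-Lipschitz whenever tau * ||W||_2^2 <= 2. Whether this matches the paper's exact parameterisation is an assumption; the dimensions, the random weights and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_layer(dim, width, tau=1.0):
    """Build x -> x - tau * W^T relu(W x + b) with spectral norm of W equal to 1."""
    W = rng.standard_normal((width, dim))
    W /= np.linalg.norm(W, 2)  # normalise the spectral norm (assumed constraint)
    b = rng.standard_normal(width)

    def layer(x):
        return x - tau * W.T @ np.maximum(W @ x + b, 0.0)

    return layer

layers = [make_layer(dim=8, width=16) for _ in range(5)]

def network(x):
    for layer in layers:
        x = layer(x)
    return x

# Empirical check: the output distance never exceeds the input distance
ratios = []
for _ in range(200):
    x, y = rng.standard_normal(8), rng.standard_normal(8)
    ratios.append(np.linalg.norm(network(x) - network(y))
                  / np.linalg.norm(x - y))
max_ratio = max(ratios)
```

Replacing the update by the ordinary residual form x + tau * relu(W x + b) removes the guarantee, since each such layer can have a Lipschitz constant above one and the bounds multiply with depth.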

OC · Jul 22, 2022
Deep neural networks on diffeomorphism groups for optimal shape reparameterization

Elena Celledoni, Helge Glöckner, Jørgen Riseth et al.

One of the fundamental problems in shape analysis is to align curves or surfaces before computing geodesic distances between their shapes. Finding the optimal reparametrization realizing this alignment is a computationally demanding task, typically done by solving an optimization problem on the diffeomorphism group. In this paper, we propose an algorithm for constructing approximations of orientation-preserving diffeomorphisms by composition of elementary diffeomorphisms. The algorithm is implemented using PyTorch, and is applicable for both unparametrized curves and surfaces. Moreover, we show universal approximation properties for the constructed architectures, and obtain bounds for the Lipschitz constants of the resulting diffeomorphisms.

LG · Jun 6, 2023
Learning Dynamical Systems from Noisy Data with Inverse-Explicit Integrators

Håkon Noren, Sølve Eidnes, Elena Celledoni

We introduce the mean inverse integrator (MII), a novel approach to increase the accuracy when training neural networks to approximate vector fields of dynamical systems from noisy data. This method can be used to average multiple trajectories obtained by numerical integrators such as Runge-Kutta methods. We show that the class of mono-implicit Runge-Kutta methods (MIRK) has particular advantages when used in connection with MII. When training vector field approximations, explicit expressions for the loss functions are obtained when inserting the training data in the MIRK formulae, unlocking symmetric and high-order integrators that would otherwise be implicit for initial value problems. The combined approach of applying MIRK within MII yields a significantly lower error compared to the plain use of the numerical integrator without averaging the trajectories. This is demonstrated with experiments using data from several (chaotic) Hamiltonian systems. Additionally, we perform a sensitivity analysis of the loss functions under normally distributed perturbations, supporting the favorable performance of MII.
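The remark that implicit formulae become explicit once training data is inserted can be illustrated with the implicit midpoint rule, the simplest mono-implicit Runge-Kutta method. When both y_n and y_{n+1} are observed, the residual (y_{n+1} - y_n)/h - f((y_n + y_{n+1})/2) needs no nonlinear solve. In the sketch below the known field of a harmonic oscillator stands in for the neural network, and noise-free data is assumed; the averaging step of MII itself is not shown.

```python
import numpy as np

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def f(y):
    """Vector field of a harmonic oscillator; stands in for a learned model."""
    return J @ y

def exact(t):
    """Exact solution of y' = J y with y(0) = (1, 0)."""
    return np.array([np.cos(t), -np.sin(t)])

h = 0.1
y0, y1 = exact(0.0), exact(h)  # two consecutive (noise-free) observations

# Midpoint (MIRK-type) residual: explicit because y1 is data, not an unknown
midpoint_residual = np.linalg.norm((y1 - y0) / h - f(0.5 * (y0 + y1)))

# Explicit Euler residual for comparison (first-order accurate)
euler_residual = np.linalg.norm((y1 - y0) / h - f(y0))
```

With the true vector field, the midpoint residual is of size O(h^2) while the Euler residual is O(h), so a loss built on the midpoint formula penalises the correct model far less.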

LG · Oct 27, 2025 · Code
Mixed Precision Training of Neural ODEs

Elena Celledoni, Brynjulf Owren, Lars Ruthotto et al.

Exploiting low-precision computations has become a standard strategy in deep learning to address the growing computational costs imposed by ever larger models and datasets. However, naively performing all computations in low precision can lead to roundoff errors and instabilities. Therefore, mixed precision training schemes usually store the weights in high precision and use low-precision computations only for whitelisted operations. Despite their success, these principles are currently not reliable for training continuous-time architectures such as neural ordinary differential equations (Neural ODEs). This paper presents a mixed precision training framework for neural ODEs, combining explicit ODE solvers with a custom backpropagation scheme, and demonstrates its effectiveness across a range of learning tasks. Our scheme uses low-precision computations for evaluating the velocity, parameterized by the neural network, and for storing intermediate states, while stability is provided by a custom dynamic adjoint scaling and by accumulating the solution and gradients in higher precision. These contributions address two key challenges in training neural ODEs: the computational cost of repeated network evaluations and the growth of memory requirements with the number of time steps or layers. Along with the paper, we publish our extendable, open-source PyTorch package rampde, whose syntax resembles that of leading packages to provide a drop-in replacement in existing codes. We demonstrate the reliability and effectiveness of our scheme using challenging test cases and on neural ODE applications in image classification and generative models, achieving approximately 50% memory reduction and up to 2x speedup while maintaining accuracy comparable to single-precision training.

NA · Mar 19, 2025
Approximation properties of neural ODEs

Arturo De Marinis, Davide Murari, Elena Celledoni et al.

We study the approximation properties of shallow neural networks whose activation function is defined as the flow map of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters satisfy specific constraints. In particular, we constrain the Lipschitz constant of the neural ODE's flow map and the norms of the weights to increase the network's stability. We prove that the UAP holds if we consider either constraint independently. When both are enforced, there is a loss of expressiveness, and we derive approximation bounds that quantify how accurately such a constrained network can approximate a continuous function.

NA · May 1, 2023
Predictions Based on Pixel Data: Insights from PDEs and Finite Differences

Elena Celledoni, James Jackaman, Davide Murari et al.

As supported by abundant experimental evidence, neural networks are state-of-the-art for many approximation tasks in high-dimensional spaces. Still, there is a lack of a rigorous theoretical understanding of what they can approximate, at which cost, and at which accuracy. One network architecture of practical use, especially for approximation tasks involving images, is (residual) convolutional networks. However, due to the locality of the linear operators involved in these networks, their analysis is more complicated than that of fully connected neural networks. This paper deals with approximation of time sequences where each observation is a matrix. We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines. We constructively derive these results by exploiting the connections between discrete convolution and finite difference operators. Our network architecture is inspired by those typically adopted in the approximation of time sequences. We support our theoretical results with numerical experiments simulating the linear advection, heat, and Fisher equations.
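The connection between discrete convolution and finite difference operators can be sketched in one space dimension (a simplification of the matrix-valued setting in the abstract). The kernel [1, -2, 1] / dx^2 is the centred second-difference stencil, and one forward Euler step of the heat equation is then a residual-type update u + dt * nu * conv(u). The grid, the periodic boundary treatment and the parameter values are assumptions chosen for the example.

```python
import numpy as np

kernel = np.array([1.0, -2.0, 1.0])  # centred second-difference stencil

# 1) The stencil reproduces the second derivative of x^2 exactly
x = np.linspace(0.0, 1.0, 11)
dx = x[1] - x[0]
second_derivative = np.convolve(x**2, kernel, mode="valid") / dx**2

# 2) One forward Euler step of u_t = nu * u_xx with periodic boundaries
def heat_step(u, dt, dx, nu=1.0):
    """Residual-type update: identity plus a scaled convolution."""
    u_pad = np.concatenate([u[-1:], u, u[:1]])  # periodic padding
    laplacian = np.convolve(u_pad, kernel, mode="valid") / dx**2
    return u + dt * nu * laplacian

n = 64
grid = np.arange(n) / n
u0 = np.sin(2.0 * np.pi * grid)
u1 = heat_step(u0, dt=1e-5, dx=1.0 / n)
```

Because the kernel is symmetric, the flip performed by `np.convolve` has no effect; for a one-sided stencil such as an upwind difference the kernel would need to be reversed.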

NA · Jan 31, 2022
Learning Hamiltonians of constrained mechanical systems

Elena Celledoni, Andrea Leone, Davide Murari et al.

Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.

LG · Feb 23, 2021
Equivariant neural networks for inverse problems

Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann et al.

In recent years the use of convolutional layers to encode an inductive bias (translational equivariance) in neural networks has proven to be a very fruitful idea. The successes of this approach have motivated a line of research into incorporating other symmetries into deep learning methods, in the form of group equivariant convolutional neural networks. Much of this work has been focused on roto-translational symmetry of $\mathbf R^d$, but other examples are the scaling symmetry of $\mathbf R^d$ and rotational symmetry of the sphere. In this work, we demonstrate that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods for inverse problems that are motivated by the variational regularisation approach. Indeed, if the regularisation functional is invariant under a group symmetry, the corresponding proximal operator will satisfy an equivariance property with respect to the same group symmetry. As a result of this observation, we design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks. We use roto-translationally equivariant operations in the proposed methodology and apply it to the problems of low-dose computerised tomography reconstruction and subsampled magnetic resonance imaging reconstruction. The proposed methodology is demonstrated to improve the reconstruction quality of a learned reconstruction method with a little extra computational cost at training time but without any extra cost at test time.
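The statement that an invariant regulariser yields an equivariant proximal operator can be checked numerically in a simple case. The sketch below assumes the quadratic regulariser R(x) = (lam/2) ||D x||^2 with a periodic difference operator D, which is invariant under cyclic shifts; its proximal operator is the linear solve (I + lam D^T D)^{-1} x. The regulariser, the group (cyclic translations) and all parameter values are chosen for illustration and are not those used in the paper.

```python
import numpy as np

n, lam = 32, 0.7

# Periodic forward difference: (D x)_i = x_{i+1} - x_i (indices mod n)
D = np.roll(np.eye(n), 1, axis=1) - np.eye(n)
M = np.eye(n) + lam * D.T @ D

def prox(x):
    """Proximal operator of R(x) = (lam/2) * ||D x||^2."""
    return np.linalg.solve(M, x)

rng = np.random.default_rng(1)
x = rng.standard_normal(n)
shift = 5

# Equivariance: shifting then applying prox equals applying prox then shifting
shift_then_prox = prox(np.roll(x, shift))
prox_then_shift = np.roll(prox(x), shift)
equivariance_error = np.linalg.norm(shift_then_prox - prox_then_shift)
```

The same check fails if D is replaced by a non-periodic difference matrix, because the regulariser is then no longer invariant under cyclic shifts.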

LG · Jun 5, 2020
Structure preserving deep learning

Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann et al.

Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to successfully design a deep learning approach for a given problem. A large amount of progress made in deep learning has been based on heuristic explorations, but there is a growing effort to mathematically understand the structure in existing deep learning methods and to systematically design new deep learning methods to preserve certain types of structure in deep learning. In this article, we review a number of these directions: some deep neural networks can be understood as discretisations of dynamical systems, neural networks can be designed to have desirable properties such as invertibility or group equivariance, and new algorithmic frameworks based on conformal Hamiltonian systems and Riemannian manifolds to solve the optimisation problems have been proposed. We conclude our review of each of these topics by discussing some open problems that we consider to be interesting directions for future research.

DG · Jun 14, 2019
Signatures in Shape Analysis: an Efficient Approach to Motion Identification

Elena Celledoni, Pål Erik Lystad, Nikolas Tapia

Signatures provide a succinct description of certain features of paths in a reparametrization invariant way. We propose a method for classifying shapes based on signatures, and compare it to current approaches based on the SRV transform and dynamic programming.
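To make the reparametrization invariance concrete, the sketch below computes the first two signature levels of a piecewise linear path using Chen's identity, then compares a coarse and a finer sampling of the same curve. Truncating at level 2 and the particular L-shaped path are assumptions for the example; the function name is hypothetical.

```python
import numpy as np

def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise linear path."""
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)        # level 1: total increment
    s2 = np.zeros((d, d))   # level 2: iterated integrals
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment
        s2 = s2 + np.outer(s1, delta) + 0.5 * np.outer(delta, delta)
        s1 = s1 + delta
    return s1, s2

# The same L-shaped curve sampled at two different sets of points
coarse = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
fine = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [1.0, 0.5], [1.0, 1.0]]

s1_coarse, s2_coarse = signature_level2(coarse)
s1_fine, s2_fine = signature_level2(fine)
```

Both samplings give the same signature terms, and the antisymmetric part of the level-2 matrix is the signed (Lévy) area enclosed between the path and its chord.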

OC · Apr 11, 2019
Deep learning as optimal control problems: models and numerical methods

Martin Benning, Elena Celledoni, Matthias J. Ehrhardt et al.

We consider recent work of Haber and Ruthotto 2017 and Chang et al. 2018, where deep learning neural networks have been interpreted as discretisations of an optimal control problem subject to an ordinary differential equation constraint. We review the first order conditions for optimality, and the conditions ensuring optimality after discretisation. This leads to a class of algorithms for solving the discrete optimal control problem which guarantee that the corresponding discrete necessary conditions for optimality are fulfilled. The differential equation setting lends itself to learning additional parameters such as the time discretisation. We explore this extension alongside natural constraints (e.g. time steps lie in a simplex). We compare these deep learning algorithms numerically in terms of induced flow and generalisation ability.
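A minimal sketch of the viewpoint described here treats a residual network as a forward Euler discretisation of x' = sigma(W(t) x + b(t)), with one step size per layer. To keep the step sizes non-negative and summing to the final time, the example parameterises them with a softmax; this is one possible way to stay on the simplex and is an assumption, not necessarily how the paper enforces the constraint. The activation, dimensions and random weights are likewise hypothetical.

```python
import numpy as np

def simplex_steps(raw, total_time=1.0):
    """Map unconstrained parameters to positive steps summing to total_time."""
    e = np.exp(raw - np.max(raw))  # shifted for numerical stability
    return total_time * e / e.sum()

def resnet_forward(x, weights, biases, steps):
    """Forward Euler: x_{k+1} = x_k + h_k * tanh(W_k x_k + b_k)."""
    for W, b, h in zip(weights, biases, steps):
        x = x + h * np.tanh(W @ x + b)
    return x

rng = np.random.default_rng(2)
n_layers, dim = 6, 4
weights = [rng.standard_normal((dim, dim)) for _ in range(n_layers)]
biases = [rng.standard_normal(dim) for _ in range(n_layers)]
raw_steps = rng.standard_normal(n_layers)  # would be trained with the weights

steps = simplex_steps(raw_steps)
output = resnet_forward(rng.standard_normal(dim), weights, biases, steps)
```

In a training loop, `raw_steps` would be optimised jointly with the weights, so the network can place small steps where the dynamics change quickly.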