LG DS NA · Oct 5, 2022

Dynamical systems' based neural networks

Elena Celledoni, Davide Murari, Brynjulf Owren, Carola-Bibiane Schönlieb, Ferdia Sherry

arXiv:2210.02373v211.113 citations · h-index: 49 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more interpretable and robust neural network architectures, particularly for adversarial defense. Its contribution is incremental: it applies dynamical systems theory to network design.

The paper aims to design neural networks with more mathematical structure and a clearer theoretical footing. It constructs them from non-autonomous ODEs using a structure-preserving time discretisation. The resulting 1-Lipschitz architectures are expressive and robust against adversarial attacks, as shown on the CIFAR-10 and CIFAR-100 datasets.

Neural networks have gained much interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to approximate, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. A particular focus is on 1-Lipschitz architectures including layers that are not 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.
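As a rough illustration of the construction described in the abstract (not the authors' implementation), the sketch below discretises the ODE x' = -W(t)^T ReLU(W(t) x + b(t)) with explicit Euler steps, one step per layer. The choice of vector field, the step size h = 2 / ||W||_F^2, and all function names are assumptions made for this example. With a ReLU activation, such a step is 1-Lipschitz in the Euclidean norm whenever h <= 2 / ||W||_2^2; the Frobenius norm is used here as a simple, conservative upper bound on the spectral norm.

```python
import math
import random


def matvec(W, x):
    """Return W @ x for a matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matvec_t(W, y):
    """Return W^T @ y."""
    return [sum(W[i][j] * y[i] for i in range(len(W))) for j in range(len(W[0]))]


def euler_layer(x, W, b):
    """One explicit Euler step of x' = -W^T relu(W x + b).

    The step size h = 2 / ||W||_F^2 is an assumption of this sketch. Since the
    Frobenius norm bounds the spectral norm from above, h <= 2 / ||W||_2^2,
    which is enough for the step to be 1-Lipschitz in the Euclidean norm.
    """
    h = 2.0 / sum(w * w for row in W for w in row)
    hidden = [max(0.0, p + bi) for p, bi in zip(matvec(W, x), b)]
    return [xi - h * gi for xi, gi in zip(x, matvec_t(W, hidden))]


def make_layers(rng, depth, dim, hidden_dim):
    """Draw one (W, b) pair per layer; layer-dependent weights play the role
    of the time dependence in the non-autonomous ODE."""
    return [
        ([[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden_dim)],
         [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)])
        for _ in range(depth)
    ]


def network(x, layers):
    """Compose the Euler steps; a composition of 1-Lipschitz maps is 1-Lipschitz."""
    for W, b in layers:
        x = euler_layer(x, W, b)
    return x


def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(u, v)))


if __name__ == "__main__":
    rng = random.Random(0)
    layers = make_layers(rng, depth=5, dim=4, hidden_dim=8)
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    y = [rng.gauss(0.0, 1.0) for _ in range(4)]
    ratio = dist(network(x, layers), network(y, layers)) / dist(x, y)
    print(f"output distance / input distance = {ratio:.4f}")
```

The printed ratio should not exceed 1 for any pair of inputs, which is the property that limits how far a small adversarial perturbation of the input can move the output. Networks built only from such non-expansive steps can lose expressiveness, which is presumably why the abstract highlights architectures that also include layers that are not 1-Lipschitz.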
