Oliver J. Sutton

LG
h-index 25
10 papers
22 citations
Novelty 57%
AI Score 36

10 Papers

NA Apr 24, 2017
A posteriori error estimates for the virtual element method

Andrea Cangiani, Emmanuil H. Georgoulis, Tristan Pryer et al.

An a posteriori error analysis for the virtual element method (VEM) applied to general elliptic problems is presented. The resulting error estimator is of residual-type and applies on very general polygonal/polyhedral meshes. The estimator is fully computable as it relies only on quantities available from the VEM solution, namely its degrees of freedom and element-wise polynomial projection. Upper and lower bounds of the error estimator with respect to the VEM approximation error are proven. The error estimator is used to drive adaptive mesh refinement in a number of test problems. Mesh adaptation is particularly simple to implement since elements with consecutive co-planar edges/faces are allowed and, therefore, locally adapted meshes do not require any local mesh post-processing.

LG Sep 7, 2023
How adversarial attacks can disrupt seemingly stable accurate classifiers

Oliver J. Sutton, Qinghua Zhou, Ivan Y. Tyukin et al.

Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are robust to large random perturbations of the input data remain susceptible to small, easily constructed, adversarial perturbations of their inputs. Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data. We introduce a simple, generic and generalisable framework for which key behaviours observed in practical systems arise with high probability -- notably the simultaneous susceptibility of the (otherwise accurate) model to easily constructed adversarial attacks, and robustness to random perturbations of the input data. We confirm that the same phenomena are directly observed in practical neural networks trained on standard image classification problems, where even large additive random noise fails to trigger the adversarial instability of the network. A surprising takeaway is that even small margins separating a classifier's decision surface from training and testing data can hide adversarial susceptibility from being detected using randomly sampled perturbations. Counterintuitively, using additive noise during training or testing is therefore inefficient for eradicating or detecting adversarial examples, and more demanding adversarial training is required.
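The contrast between random and adversarial perturbations described in this abstract can be illustrated with a toy experiment. The sketch below is not the paper's framework: it assumes a linear classifier sign(w·x) with a unit normal w, a single correctly classified point at a small margin, and random noise drawn uniformly from a sphere. The function name `flip_rates` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def flip_rates(dim, margin=0.1, radius=1.0, trials=1000, seed=0):
    """Compare random and adversarial perturbations of one point (toy model).

    Assumed setting: the classifier is sign(w . x) with unit normal w, and the
    point x sits at distance `margin` on the positive side of the boundary.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    w[0] = 1.0                      # unit normal of the decision hyperplane
    x = margin * w                  # correctly classified, small margin

    # Random perturbations: uniform directions, all of Euclidean norm `radius`.
    noise = rng.standard_normal((trials, dim))
    noise *= radius / np.linalg.norm(noise, axis=1, keepdims=True)
    random_flip_rate = float(np.mean((x + noise) @ w < 0))

    # Adversarial perturbation: step of norm 2 * margin against the normal.
    adversarial_norm = 2.0 * margin
    adversarial_flips = bool((x - adversarial_norm * w) @ w < 0)
    return random_flip_rate, adversarial_flips, adversarial_norm

if __name__ == "__main__":
    for dim in (2, 2000):
        print(dim, flip_rates(dim))
```

In this toy model the adversarial step is five times smaller than the random noise, yet it always crosses the boundary, while the fraction of random perturbations that do so shrinks as the dimension grows, because a random direction has only a small component along w.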

PE Sep 28, 2017
Revealing new dynamical patterns in a reaction-diffusion model with cyclic competition via a novel computational framework

Andrea Cangiani, Emmanuil H. Georgoulis, Andrew Yu. Morozov et al.

Understanding how patterns and travelling waves form in chemical and biological reaction-diffusion models is an area which has been widely researched, yet is still experiencing fast development. Surprisingly enough, we still do not have a clear understanding about all possible types of dynamical regimes in classical reaction-diffusion models such as Lotka-Volterra competition models with spatial dependence. In this work, we demonstrate some new types of wave propagation and pattern formation in a classical three species cyclic competition model with spatial diffusion, which have so far been missed in the literature. These new patterns are characterised by a high regularity in space, but are different from patterns previously known to exist in reaction-diffusion models, and may have important applications in improving our understanding of biological pattern formation and invasion theory. Finding these new patterns is made technically possible by using an automatic adaptive finite element method driven by a novel a posteriori error estimate which is proven to provide a reliable bound for the error of the numerical method. We demonstrate how this numerical framework allows us to easily explore the dynamical patterns in both two and three spatial dimensions.

NA Jun 21, 2016
The Virtual Element Method in 50 lines of MATLAB

Oliver J. Sutton

We present a 50-line MATLAB implementation of the lowest order virtual element method for the two-dimensional Poisson problem on general polygonal meshes. The matrix formulation of the method is discussed, along with the structure of the overall algorithm for computing with a virtual element method. The purpose of this software is primarily educational, to demonstrate how the key components of the method can be translated into code.
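To give a flavour of how the key components of the method translate into code, here is a minimal Python/NumPy sketch of the local stiffness matrix of the lowest order virtual element method on a single polygon. It is not the paper's MATLAB listing: the function name `local_vem_stiffness`, the use of scaled monomials, the vertex-average constant projection and the simple identity-based stabilisation are assumptions of this sketch, and the vertices are assumed to be listed counter-clockwise.

```python
import numpy as np

def local_vem_stiffness(verts):
    """Lowest order VEM stiffness matrix for one polygon (sketch).

    Assumes `verts` is an (n, 2) array of vertices in counter-clockwise order.
    """
    verts = np.asarray(verts, dtype=float)
    n_sides = len(verts)
    nxt = np.roll(verts, -1, axis=0)    # vertex following each vertex
    prv = np.roll(verts, 1, axis=0)     # vertex preceding each vertex

    # Geometry: area and centroid via the shoelace formula, diameter as the
    # largest distance between any two vertices.
    cross = verts[:, 0] * nxt[:, 1] - nxt[:, 0] * verts[:, 1]
    area = 0.5 * cross.sum()
    centroid = ((verts + nxt) * cross[:, None]).sum(axis=0) / (6.0 * area)
    diameter = np.max(np.linalg.norm(verts[:, None, :] - verts[None, :, :], axis=2))

    # D: scaled monomials 1, (x - xc)/h, (y - yc)/h evaluated at the vertices.
    D = np.ones((n_sides, 3))
    D[:, 1:] = (verts - centroid) / diameter

    # B: row 0 is the vertex average; rows 1-2 are boundary integrals of the
    # monomials' normal derivatives against each vertex basis function.
    normals = np.column_stack([nxt[:, 1] - prv[:, 1], prv[:, 0] - nxt[:, 0]])
    B = np.empty((3, n_sides))
    B[0, :] = 1.0 / n_sides
    B[1:, :] = 0.5 * normals.T / diameter

    G = B @ D
    projector = np.linalg.solve(G, B)   # projection onto linear polynomials
    G_tilde = G.copy()
    G_tilde[0, :] = 0.0                 # constants carry no gradient energy
    residual = np.eye(n_sides) - D @ projector
    return projector.T @ G_tilde @ projector + residual.T @ residual

if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    print(local_vem_stiffness(square))
```

A useful sanity check for any such implementation is polynomial consistency: for the vertex values u of a linear function, the quadratic form u·Ku should equal the squared gradient norm times the element area, and K should annihilate constants.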

LG Oct 10, 2023
Relative intrinsic dimensionality is intrinsic to learning

Oliver J. Sutton, Qinghua Zhou, Alexander N. Gorban et al.

High dimensional data can have a surprising property: pairs of data points may be easily separated from each other, or even from arbitrary subsets, with high probability using just simple linear classifiers. However, this is more of a rule of thumb than a reliable property as high dimensionality alone is neither necessary nor sufficient for successful learning. Here, we introduce a new notion of the intrinsic dimension of a data distribution, which precisely captures the separability properties of the data. For this intrinsic dimension, the rule of thumb above becomes a law: high intrinsic dimension guarantees highly separable data. We extend this notion to that of the relative intrinsic dimension of two data distributions, which we show provides both upper and lower bounds on the probability of successfully learning and generalising in a binary classification problem.
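The rule of thumb about linear separability can be observed numerically. The sketch below does not compute the paper's intrinsic dimension; it assumes standard Gaussian data and a Fisher-type criterion (a point x is counted as separated from y when the inner product of x with y is smaller than that of x with itself), both chosen here purely for illustration. The function name `separable_fraction` is hypothetical.

```python
import numpy as np

def separable_fraction(dim, n_points=200, seed=0):
    """Fraction of ordered pairs (x, y) with <x, y> < <x, x> (toy criterion).

    Assumed setting: points are i.i.d. standard Gaussian vectors in R^dim.
    """
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n_points, dim))
    gram = points @ points.T
    self_products = np.diag(gram)[:, None]
    separated = gram < self_products            # entry (i, j) tests x_i against x_j
    off_diagonal = ~np.eye(n_points, dtype=bool)
    return float(separated[off_diagonal].mean())

if __name__ == "__main__":
    for dim in (1, 10, 200):
        print(dim, separable_fraction(dim))
```

Under these assumptions the fraction of separated pairs is well below one in a single dimension and approaches one as the dimension grows, which is the behaviour the abstract describes as a rule of thumb rather than a guarantee.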

LG Nov 7, 2022
Towards a mathematical understanding of learning from few examples with nonlinear feature maps

Oliver J. Sutton, Alexander N. Gorban, Ivan Y. Tyukin

We consider the problem of data classification where the training set consists of just a few data points. We explore this phenomenon mathematically and reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities. The main thrust of our analysis is to reveal the influence on the model's generalisation capabilities of nonlinear feature transformations mapping the original data into high, and possibly infinite, dimensional spaces.

NA Mar 8, 2018
Long time $L^\infty(L^2)$ a posteriori error estimates for fully discrete parabolic problems

Oliver J. Sutton

Computable estimates for the error of finite element discretisations of parabolic problems in the $L^\infty(0,T; L^2)$ norm are developed, which exhibit constant effectivities (the ratio of the estimated error to the true error) with respect to the simulation time. These estimates, which are of optimal order, represent a significant advantage for long-time simulations, and are derived using energy techniques based on elliptic reconstructions. The effectivities of previous optimal order error estimates in this norm derived using energy techniques are shown numerically to grow either in proportion to the simulation duration or its square root, a key disadvantage compared with earlier estimators derived using parabolic duality arguments. The new estimates form a continuous family, almost all of which are new, reproducing certain familiar energy-based estimates well suited for short-time simulations and not available through the parabolic duality framework. For clarity, we demonstrate the technique applied to a linear parabolic problem discretised using standard conforming finite element methods in space coupled with backward Euler and Crank-Nicolson time discretisations, although it can be applied much more widely.
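For readers unfamiliar with the two time discretisations named in this abstract, the following sketch applies backward Euler and Crank-Nicolson to a scalar model problem. It does not implement the paper's error estimators or any spatial discretisation: the test equation u' = -lam * u with u(0) = 1, the function name `theta_scheme`, and the step counts are assumptions made for illustration.

```python
import math

def theta_scheme(lam, final_time, n_steps, theta):
    """Integrate u' = -lam * u, u(0) = 1 with the theta method.

    theta = 1.0 gives backward Euler, theta = 0.5 gives Crank-Nicolson.
    """
    dt = final_time / n_steps
    # One step: (1 + theta*dt*lam) * u_new = (1 - (1 - theta)*dt*lam) * u_old
    growth = (1.0 - (1.0 - theta) * dt * lam) / (1.0 + theta * dt * lam)
    u = 1.0
    for _ in range(n_steps):
        u *= growth
    return u

def final_time_error(lam, final_time, n_steps, theta):
    """Absolute error at the final time against the exact solution exp(-lam*T)."""
    exact = math.exp(-lam * final_time)
    return abs(theta_scheme(lam, final_time, n_steps, theta) - exact)

if __name__ == "__main__":
    for name, theta in (("backward Euler", 1.0), ("Crank-Nicolson", 0.5)):
        coarse = final_time_error(1.0, 1.0, 50, theta)
        fine = final_time_error(1.0, 1.0, 100, theta)
        print(name, coarse, fine, coarse / fine)
```

Halving the time step roughly halves the backward Euler error and roughly quarters the Crank-Nicolson error, reflecting their first and second order accuracy. An effectivity index in the sense of the abstract would be the ratio of an estimated error to a true error like the one computed here.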

AI Jun 18, 2024
Stealth edits to large language models

Oliver J. Sutton, Qinghua Zhou, Wei Wang et al.

We reveal the theoretical foundations of techniques for editing large language models, and present new methods which can do so without requiring retraining. Our theoretical insights show that a single metric (a measure of the intrinsic dimension of the model's features) can be used to assess a model's editability and reveals its previously unrecognised susceptibility to malicious stealth attacks. This metric is fundamental to predicting the success of a variety of editing approaches, and reveals new bridges between disparate families of editing methods. We collectively refer to these as stealth editing methods, because they directly update a model's weights to specify its response to specific known hallucinating prompts without affecting other model behaviour. By carefully applying our theoretical insights, we are able to introduce a new jet-pack network block which is optimised for highly selective model editing, uses only standard network operations, and can be inserted into existing networks. We also reveal the vulnerability of language models to stealth attacks: a small change to a model's weights which fixes its response to a single attacker-chosen prompt. Stealth attacks are computationally simple, do not require access to or knowledge of the model's training data, and therefore represent a potent yet previously unrecognised threat to redistributed foundation models. Extensive experimental results illustrate and support our methods and their theoretical underpinnings. Demos and source code are available at https://github.com/qinghua-zhou/stealth-edits.
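The idea of a highly selective edit, one that responds to a single chosen input while leaving others untouched, can be illustrated with a toy detector acting on feature vectors. This is not the paper's algorithm, its jet-pack block, or the code in the linked repository: the cosine-similarity threshold, the gain, the Gaussian stand-in features and the name `make_detector` are all assumptions of this sketch.

```python
import numpy as np

def make_detector(trigger_feature, threshold=0.9, gain=100.0):
    """Build a one-neuron detector for a single feature vector (toy model).

    The neuron outputs gain * ReLU(cos(feature, trigger) - threshold), so it is
    silent unless the input feature is nearly aligned with the trigger.
    """
    direction = np.asarray(trigger_feature, dtype=float)
    direction = direction / np.linalg.norm(direction)

    def detector(feature):
        feature = np.asarray(feature, dtype=float)
        cosine = float(direction @ (feature / np.linalg.norm(feature)))
        return gain * max(0.0, cosine - threshold)

    return detector

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trigger = rng.standard_normal(512)       # stand-in for a prompt's feature vector
    detector = make_detector(trigger)
    others = rng.standard_normal((1000, 512))
    print(detector(trigger), max(detector(f) for f in others))
```

In this toy setting the selectivity comes from dimension: independent high-dimensional feature vectors are nearly orthogonal, so their cosine similarity with the trigger stays far below the threshold. This is loosely in the spirit of the abstract's link between editability and the intrinsic dimension of the model's features.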

CV Jul 29, 2025
Staining and locking computer vision models without retraining

Oliver J. Sutton, Qinghua Zhou, George Leete et al.

We introduce new methods of staining and locking computer vision models, to protect their owners' intellectual property. Staining, also known as watermarking, embeds secret behaviour into a model which can later be used to identify it, while locking aims to make a model unusable unless a secret trigger is inserted into input images. Unlike existing methods, our algorithms can be used to stain and lock pre-trained models without requiring fine-tuning or retraining, and come with provable, computable guarantees bounding their worst-case false positive rates. The stain and lock are implemented by directly modifying a small number of the model's weights and have minimal impact on the (unlocked) model's performance. Locked models are unlocked by inserting a small 'trigger patch' into the corner of the input image. We present experimental results showing the efficacy of our methods and demonstrating their practical performance on a variety of computer vision models.

LG Jan 31, 2024
Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

Ivan Y. Tyukin, Tatiana Tyukina, Daniel van Helden et al.

We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining from making a decision. A key technical focus of the work is in providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions on the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.