Ivor Simpson

h-index16

16papers

192citations

Novelty48%

AI Score47

Ranked #30,766 of 194,257 authors (top 16%)#11,025 in CV (top 19%)

16 Papers

5.9CVSep 27, 2023

The Robust Semantic Segmentation UNCV2023 Challenge Results

Xuanlong Yu, Yi Zuo, Zitao Wang et al. · cmu

This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023. The challenge was centered around semantic segmentation in urban environments, with a particular focus on natural adversarial scenarios. The report presents the results of 19 submitted entries, with numerous techniques drawing inspiration from cutting-edge uncertainty quantification methodologies presented at prominent conferences in the fields of computer vision and machine learning and journals over the past few years. Within this document, the challenge is introduced, shedding light on its purpose and objectives, which primarily revolved around enhancing the robustness of semantic segmentation in urban scenes under varying natural adversarial conditions. The report then delves into the top-performing solutions. Moreover, the document aims to provide a comprehensive overview of the diverse solutions deployed by all participants. By doing so, it seeks to offer readers a deeper insight into the array of strategies that can be leveraged to effectively handle the inherent uncertainties associated with autonomous driving and semantic segmentation, especially within urban environments.

2.7IVMar 11, 2022Code

Flexible Amortized Variational Inference in qBOLD MRI

Ivor J. A. Simpson, Ashley McManamon, Balázs Örzsik et al.

Streamlined qBOLD acquisitions enable experimentally straightforward observations of brain oxygen metabolism. $R_2^\prime$ maps are easily inferred; however, the Oxygen extraction fraction (OEF) and deoxygenated blood volume (DBV) are more ambiguously determined from the data. As such, existing inference methods tend to yield very noisy and underestimated OEF maps, while overestimating DBV. This work describes a novel probabilistic machine learning approach that can infer plausible distributions of OEF and DBV. Initially, we create a model that produces informative voxelwise prior distribution based on synthetic training data. Contrary to prior work, we model the joint distribution of OEF and DBV through a scaled multivariate logit-Normal distribution, which enables the values to be constrained within a plausible range. The prior distribution model is used to train an efficient amortized variational Bayesian inference model. This model learns to infer OEF and DBV by predicting real image data, with few training data required, using the signal equations as a forward model. We demonstrate that our approach enables the inference of smooth OEF and DBV maps, with a physiologically plausible distribution that can be adapted through specification of an informative prior distribution. Other benefits include model comparison (via the evidence lower bound) and uncertainty quantification for identifying image artefacts. Results are demonstrated on a small study comparing subjects undergoing hyperventilation and at rest. We illustrate that the proposed approach allows measurement of gray matter differences in OEF and DBV and enables voxelwise comparison between conditions, where we observe significant increases in OEF and $R_2^\prime$ during hyperventilation.

8.1IVOct 26, 2022

Compressed Sensing MRI Reconstruction Regularized by VAEs with Structured Image Covariance

Margaret Duff, Ivor J. A. Simpson, Matthias J. Ehrhardt et al.

Objective: This paper investigates how generative models, trained on ground-truth images, can be used \changes{as} priors for inverse problems, penalizing reconstructions far from images the generator can produce. The aim is that learned regularization will provide complex data-driven priors to inverse problems while still retaining the control and insight of a variational regularization method. Moreover, unsupervised learning, without paired training data, allows the learned regularizer to remain flexible to changes in the forward problem such as noise level, sampling pattern or coil sensitivities in MRI. Approach: We utilize variational autoencoders (VAEs) that generate not only an image but also a covariance uncertainty matrix for each image. The covariance can model changing uncertainty dependencies caused by structure in the image, such as edges or objects, and provides a new distance metric from the manifold of learned images. Main results: We evaluate these novel generative regularizers on retrospectively sub-sampled real-valued MRI measurements from the fastMRI dataset. We compare our proposed learned regularization against other unlearned regularization approaches and unsupervised and supervised deep learning methods. Significance: Our results show that the proposed method is competitive with other state-of-the-art methods and behaves consistently with changing sampling patterns and noise levels.

8.1CVMar 29, 2022

Learning Structured Gaussians to Approximate Deep Ensembles

Ivor J. A. Simpson, Sara Vicente, Neill D. F. Campbell

This paper proposes using a sparse-structured multivariate Gaussian to provide a closed-form approximator for the output of probabilistic ensemble models used for dense image prediction tasks. This is achieved through a convolutional neural network that predicts the mean and covariance of the distribution, where the inverse covariance is parameterised by a sparsely structured Cholesky matrix. Similarly to distillation approaches, our single network is trained to maximise the probability of samples from pre-trained probabilistic models, in this work we use a fixed ensemble of networks. Once trained, our compact representation can be used to efficiently draw spatially correlated samples from the approximated output distribution. Importantly, this approach captures the uncertainty and structured correlations in the predictions explicitly in a formal distribution, rather than implicitly through sampling alone. This allows direct introspection of the model, enabling visualisation of the learned structure. Moreover, this formulation provides two further benefits: estimation of a sample probability, and the introduction of arbitrary spatial conditioning at test time. We demonstrate the merits of our approach on monocular depth estimation and show that the advantages of our approach are obtained with comparable quantitative performance.

6.2IVMar 18

Structured SIR: Efficient and Expressive Importance-Weighted Inference for High-Dimensional Image Registration

Ivor J. A. Simpson, Neill D. F. Campbell

Image registration is an ill-posed dense vision task, where multiple solutions achieve similar loss values, motivating probabilistic inference. Variational inference has previously been employed to capture these distributions, however restrictive assumptions about the posterior form can lead to poor characterisation, overconfidence and low-quality samples. More flexible posteriors are typically bottlenecked by the complexity of high-dimensional covariance matrices required for dense 3D image registration. In this work, we present a memory and computationally efficient inference method, Structured SIR, that enables expressive, multi-modal, characterisation of uncertainty with high quality samples. We propose the use of a Sampled Importance Resampling (SIR) algorithm with a novel memory-efficient high-dimensional covariance parameterisation as the sum of a low-rank covariance and a sparse, spatially structured Cholesky precision factor. This structure enables capturing complex spatial correlations while remaining computationally tractable. We evaluate the efficacy of this approach in 3D dense image registration of brain MRI data, which is a very high-dimensional problem. We demonstrate that our proposed methods produces uncertainty estimates that are significantly better calibrated than those produced by variational methods, achieving equivalent or better accuracy. Crucially, we show that the model yields highly structured multi-modal posterior distributions, enable effective and efficient uncertainty quantification.

3.6CVDec 3, 2025

Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images

Paula Seidler, Neill D. F. Campbell, Ivor J A Simpson

Perceptual similarity scores that align with human vision are critical for both training and evaluating computer vision models. Deep perceptual losses, such as LPIPS, achieve good alignment but rely on complex, highly non-linear discriminative features with unknown invariances, while hand-crafted measures like SSIM are interpretable but miss key perceptual properties. We introduce the Structured Uncertainty Similarity Score (SUSS); it models each image through a set of perceptual components, each represented by a structured multivariate Normal distribution. These are trained in a generative, self-supervised manner to assign high likelihood to human-imperceptible augmentations. The final score is a weighted sum of component log-probabilities with weights learned from human perceptual datasets. Unlike feature-based methods, SUSS learns image-specific linear transformations of residuals in pixel space, enabling transparent inspection through decorrelated residuals and sampling. SUSS aligns closely with human perceptual judgments, shows strong perceptual calibration across diverse distortion types, and provides localized, interpretable explanations of its similarity assessments. We further demonstrate stable optimization behavior and competitive performance when using SUSS as a perceptual loss for downstream imaging tasks.

8.4CVApr 13, 2025

Capturing Longitudinal Changes in Brain Morphology Using Temporally Parameterized Neural Displacement Fields

Aisha L. Shuaibu, Kieran A. Gibb, Peter A. Wijeratne et al.

Longitudinal image registration enables studying temporal changes in brain morphology which is useful in applications where monitoring the growth or atrophy of specific structures is important. However this task is challenging due to; noise/artifacts in the data and quantifying small anatomical changes between sequential scans. We propose a novel longitudinal registration method that models structural changes using temporally parameterized neural displacement fields. Specifically, we implement an implicit neural representation (INR) using a multi-layer perceptron that serves as a continuous coordinate-based approximation of the deformation field at any time point. In effect, for any N scans of a particular subject, our model takes as input a 3D spatial coordinate location x, y, z and a corresponding temporal representation t and learns to describe the continuous morphology of structures for both observed and unobserved points in time. Furthermore, we leverage the analytic derivatives of the INR to derive a new regularization function that enforces monotonic rate of change in the trajectory of the voxels, which is shown to provide more biologically plausible patterns. We demonstrate the effectiveness of our method on 4D brain MR registration.

3.7CVMar 4, 2024Code

HyperPredict: Estimating Hyperparameter Effects for Instance-Specific Regularization in Deformable Image Registration

Aisha L. Shuaibu, Ivor J. A. Simpson

Methods for medical image registration infer geometric transformations that align pairs/groups of images by maximising an image similarity metric. This problem is ill-posed as several solutions may have equivalent likelihoods, also optimising purely for image similarity can yield implausible transformations. For these reasons regularization terms are essential to obtain meaningful registration results. However, this requires the introduction of at least one hyperparameter often termed $λ$, that serves as a tradeoff between loss terms. In some situations, the quality of the estimated transformation greatly depends on hyperparameter choice, and different choices may be required depending on the characteristics of the data. Analyzing the effect of these hyperparameters requires labelled data, which is not commonly available at test-time. In this paper, we propose a method for evaluating the influence of hyperparameters and subsequently selecting an optimal value for given image pairs. Our approach which we call HyperPredict, implements a Multi-Layer Perceptron that learns the effect of selecting particular hyperparameters for registering an image pair by predicting the resulting segmentation overlap and measure of deformation smoothness. This approach enables us to select optimal hyperparameters at test time without requiring labelled data, removing the need for a one-size-fits-all cross-validation approach. Furthermore, the criteria used to define optimal hyperparameter is flexible post-training, allowing us to efficiently choose specific properties. We evaluate our proposed method on the OASIS brain MR dataset using a recent deep learning approach(cLapIRN) and an algorithmic method(Niftyreg). Our results demonstrate good performance in predicting the effects of regularization hyperparameters and highlight the benefits of our image-pair specific approach to hyperparameter selection.

5.1IVSep 18, 2025

Learning Mechanistic Subtypes of Neurodegeneration with a Physics-Informed Variational Autoencoder Mixture Model

Sanduni Pinnawala, Annabelle Hartanto, Ivor J. A. Simpson et al.

Modelling the underlying mechanisms of neurodegenerative diseases demands methods that capture heterogeneous and spatially varying dynamics from sparse, high-dimensional neuroimaging data. Integrating partial differential equation (PDE) based physics knowledge with machine learning provides enhanced interpretability and utility over classic numerical methods. However, current physics-integrated machine learning methods are limited to considering a single PDE, severely limiting their application to diseases where multiple mechanisms are responsible for different groups (i.e., subtypes) and aggravating problems with model misspecification and degeneracy. Here, we present a deep generative model for learning mixtures of latent dynamic models governed by physics-based PDEs, going beyond traditional approaches that assume a single PDE structure. Our method integrates reaction-diffusion PDEs within a variational autoencoder (VAE) mixture model framework, supporting inference of subtypes of interpretable latent variables (e.g. diffusivity and reaction rates) from neuroimaging data. We evaluate our method on synthetic benchmarks and demonstrate its potential for uncovering mechanistic subtypes of Alzheimer's disease progression from positron emission tomography (PET) data.

3.6CVApr 14, 2025

Investigating the Role of Bilateral Symmetry for Inpainting Brain MRI

Sergey Kuznetsov, Sanduni Pinnawala, Peter A. Wijeratne et al.

Inpainting has recently emerged as a valuable and interesting technology to employ in the analysis of medical imaging data, in particular brain MRI. A wide variety of methodologies for inpainting MRI have been proposed and demonstrated on tasks including anomaly detection. In this work we investigate the statistical relationship between inpainted brain structures and the amount of subject-specific conditioning information, i.e. the other areas of the image that are masked. In particular, we analyse the distribution of inpainting results when masking additional regions of the image, specifically the contra-lateral structure. This allows us to elucidate where in the brain the model is drawing information from, and in particular, what is the importance of hemispherical symmetry? Our experiments interrogate a diffusion inpainting model through analysing the inpainting of subcortical brain structures based on intensity and estimated area change. We demonstrate that some structures show a strong influence of symmetry in the conditioning of the inpainting process.

5.1IVMar 13, 2025

Evaluating structural uncertainty in accelerated MRI: are voxelwise measures useful surrogates?

Luca L. C. Trautmann, Peter A. Wijeratne, Itamar Ronen et al.

Introducing accelerated reconstruction algorithms into clinical settings requires measures of uncertainty quantification that accurately assess the relevant uncertainty introduced by the reconstruction algorithm. Many currently deployed approaches quantifying uncertainty by focusing on measuring the variability in voxelwise intensity variation. Although these provide interpretable maps, they lack a structural interpretation and do not show a clear relationship to how the data will be analysed subsequently. In this work we show that voxel level uncertainty does not provide insight into morphological uncertainty. To do so, we use segmentation as a clinically-relevant downstream task and deploy ensembles of reconstruction modes to measure uncertainty in the reconstructions. We show that variability and bias in the morphological structures are present and within-ensemble variability cannot be predicted well with uncertainty measured only by voxel intensity variations.

16.7AIFeb 11, 2022Code

MusIAC: An extensible generative framework for Music Infilling Applications with multi-level Control

Rui Guo, Ivor Simpson, Chris Kiefer et al.

We present a novel music generation framework for music infilling, with a user friendly interface. Infilling refers to the task of generating musical sections given the surrounding multi-track music. The proposed transformer-based framework is extensible for new control tokens as the added music control tokens such as tonal tension per bar and track polyphony level in this work. We explore the effects of including several musically meaningful control tokens, and evaluate the results using objective metrics related to pitch and rhythm. Our results demonstrate that adding additional control tokens helps to generate music with stronger stylistic similarities to the original music. It also provides the user with more control to change properties like the music texture and tonal tension in each bar compared to previous research which only provided control for track density. We present the model in a Google Colab notebook to enable interactive generation.

11.7SDOct 13, 2020Code

A variational autoencoder for music generation controlled by tonal tension

Rui Guo, Ivor Simpson, Thor Magnusson et al.

Many of the music generation systems based on neural networks are fully autonomous and do not offer control over the generation process. In this research, we present a controllable music generation system in terms of tonal tension. We incorporate two tonal tension measures based on the Spiral Array Tension theory into a variational autoencoder model. This allows us to control the direction of the tonal tension throughout the generated piece, as well as the overall level of tonal tension. Given a seed musical fragment, stemming from either the user input or from directly sampling from the latent space, the model can generate variations of this original seed fragment with altered tonal tension. This altered music still resembles the seed music rhythmically, but the pitch of the notes are changed to match the desired tonal tension as conditioned by the user.

7.8CVNov 30, 2018

The GAN that Warped: Semantic Attribute Editing with Unpaired Data

Garoe Dorta, Sara Vicente, Neill D. F. Campbell et al.

Deep neural networks have recently been used to edit images with great success, in particular for faces. However, they are often limited to only being able to work at a restricted range of resolutions. Many methods are so flexible that face edits can often result in an unwanted loss of identity. This work proposes to learn how to perform semantic image edits through the application of smooth warp fields. Previous approaches that attempted to use warping for semantic edits required paired data, i.e. example images of the same subject with different semantic attributes. In contrast, we employ recent advances in Generative Adversarial Networks that allow our model to be trained with unpaired data. We demonstrate face editing at very high resolutions (4k images) with a single forward pass of a deep network at a lower resolution. We also show that our edits are substantially better at preserving the subject's identity. The robustness of our approach is demonstrated by showing plausible image editing results on the Cub200 birds dataset. To our knowledge this has not been previously accomplished, due the challenging nature of the dataset.

6.7MLApr 3, 2018

Training VAEs Under Structured Residuals

Garoe Dorta, Sara Vicente, Lourdes Agapito et al.

Variational auto-encoders (VAEs) are a popular and powerful deep generative model. Previous works on VAEs have assumed a factorized likelihood model, whereby the output uncertainty of each pixel is assumed to be independent. This approximation is clearly limited as demonstrated by observing a residual image from a VAE reconstruction, which often possess a high level of structure. This paper demonstrates a novel scheme to incorporate a structured Gaussian likelihood prediction network within the VAE that allows the residual correlations to be modeled. Our novel architecture, with minimal increase in complexity, incorporates the covariance matrix prediction within the VAE. We also propose a new mechanism for allowing structured uncertainty on color images. Furthermore, we provide a scheme for effectively training this model, and include some suggestions for improving performance in terms of efficiency or modeling longer range correlations.

18.2MLFeb 20, 2018

Structured Uncertainty Prediction Networks

Garoe Dorta, Sara Vicente, Lourdes Agapito et al.

This paper is the first work to propose a network to predict a structured uncertainty distribution for a synthesized image. Previous approaches have been mostly limited to predicting diagonal covariance matrices. Our novel model learns to predict a full Gaussian covariance matrix for each reconstruction, which permits efficient sampling and likelihood evaluation. We demonstrate that our model can accurately reconstruct ground truth correlated residual distributions for synthetic datasets and generate plausible high frequency samples for real face images. We also illustrate the use of these predicted covariances for structure preserving image denoising.