LGNov 7, 2022
Deviations in Representations Induced by Adversarial AttacksDaniel Steinberg, Paul Munro
Deep learning has been a popular topic and has achieved success in many areas. It has drawn the attention of researchers and machine learning practitioners alike, with developed models deployed to a variety of settings. Along with its achievements, research has shown that deep learning models are vulnerable to adversarial attacks. This finding brought about a new direction in research, whereby algorithms were developed to attack and defend vulnerable networks. Our interest is in understanding how these attacks effect change on the intermediate representations of deep learning models. We present a method for measuring and analyzing the deviations in representations induced by adversarial attacks, progressively across a selected set of layers. Experiments are conducted using an assortment of attack algorithms, on the CIFAR-10 dataset, with plots created to visualize the impact of adversarial attacks across different layers in a network.
IVAug 19, 2025
A Systematic Study of Deep Learning Models and xAI Methods for Region-of-Interest Detection in MRI ScansJustin Yiu, Kushank Arora, Daniel Steinberg et al.
Magnetic Resonance Imaging (MRI) is an essential diagnostic tool for assessing knee injuries. However, manual interpretation of MRI slices remains time-consuming and prone to inter-observer variability. This study presents a systematic evaluation of various deep learning architectures combined with explainable AI (xAI) techniques for automated region of interest (ROI) detection in knee MRI scans. We investigate both supervised and self-supervised approaches, including ResNet50, InceptionV3, Vision Transformers (ViT), and multiple U-Net variants augmented with multi-layer perceptron (MLP) classifiers. To enhance interpretability and clinical relevance, we integrate xAI methods such as Grad-CAM and Saliency Maps. Model performance is assessed using AUC for classification and PSNR/SSIM for reconstruction quality, along with qualitative ROI visualizations. Our results demonstrate that ResNet50 consistently excels in classification and ROI identification, outperforming transformer-based models under the constraints of the MRNet dataset. While hybrid U-Net + MLP approaches show potential for leveraging spatial features in reconstruction and interpretability, their classification performance remains lower. Grad-CAM consistently provided the most clinically meaningful explanations across architectures. Overall, CNN-based transfer learning emerges as the most effective approach for this dataset, while future work with larger-scale pretraining may better unlock the potential of transformer models.
LGMay 29, 2023
Statistically Efficient Bayesian Sequential Experiment Design via Reinforcement Learning with Cross-Entropy EstimatorsTom Blau, Iadine Chades, Amir Dezfouli et al.
Reinforcement learning can learn amortised design policies for designing sequences of experiments. However, current amortised methods rely on estimators of expected information gain (EIG) that require an exponential number of samples on the magnitude of the EIG to achieve an unbiased estimation. We propose the use of an alternative estimator based on the cross-entropy of the joint model distribution and a flexible proposal distribution. This proposal distribution approximates the true posterior of the model parameters given the experimental history and the design policy. Our method overcomes the exponential-sample complexity of previous approaches and provide more accurate estimates of high EIG values. More importantly, it allows learning of superior design policies, and is compatible with continuous and discrete design spaces, non-differentiable likelihoods and even implicit probabilistic models.
LGNov 13, 2021
Measuring the Contribution of Multiple Model Representations in Detecting Adversarial InstancesDaniel Steinberg, Paul Munro
Deep learning models have been used for a wide variety of tasks. They are prevalent in computer vision, natural language processing, speech recognition, and other areas. While these models have worked well under many scenarios, it has been shown that they are vulnerable to adversarial attacks. This has led to a proliferation of research into ways that such attacks could be identified and/or defended against. Our goal is to explore the contribution that can be attributed to using multiple underlying models for the purpose of adversarial instance detection. Our paper describes two approaches that incorporate representations from multiple models for detecting adversarial examples. We devise controlled experiments for measuring the detection impact of incrementally utilizing additional models. For many of the scenarios we consider, the results show that performance increases with the number of underlying models used for extracting representations.
LGMay 28, 2021
Visualizing Representations of Adversarially Perturbed InputsDaniel Steinberg, Paul Munro
It has been shown that deep learning models are vulnerable to adversarial attacks. We seek to further understand the consequence of such attacks on the intermediate activations of neural networks. We present an evaluation metric, POP-N, which scores the effectiveness of projecting data to N dimensions under the context of visualizing representations of adversarially perturbed inputs. We conduct experiments on CIFAR-10 to compare the POP-2 score of several dimensionality reduction algorithms across various adversarial attacks. Finally, we utilize the 2D data corresponding to high POP-2 scores to generate example visualizations.
LGFeb 14, 2020
Fast Fair Regression via Efficient Approximations of Mutual InformationDaniel Steinberg, Alistair Reid, Simon O'Callaghan et al.
Most work in algorithmic fairness to date has focused on discrete outcomes, such as deciding whether to grant someone a loan or not. In these classification settings, group fairness criteria such as independence, separation and sufficiency can be measured directly by comparing rates of outcomes between subpopulations. Many important problems however require the prediction of a real-valued outcome, such as a risk score or insurance premium. In such regression settings, measuring group fairness criteria is computationally challenging, as it requires estimating information-theoretic divergences between conditional probability density functions. This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models from their (conditional) mutual information definitions, and uses such approximations as regularisers to enforce fairness within a regularised risk minimisation framework. Experiments in real-world datasets indicate that in spite of its superior computational efficiency our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.
LGJan 16, 2020
Fairness Measures for Regression via Probabilistic ClassificationDaniel Steinberg, Alistair Reid, Simon O'Callaghan
Algorithmic fairness involves expressing notions such as equity, or reasonable treatment, as quantifiable measures that a machine learning algorithm can optimise. Most work in the literature to date has focused on classification problems where the prediction is categorical, such as accepting or rejecting a loan application. This is in part because classification fairness measures are easily computed by comparing the rates of outcomes, leading to behaviours such as ensuring that the same fraction of eligible men are selected as eligible women. But such measures are computationally difficult to generalise to the continuous regression setting for problems such as pricing, or allocating payments. The difficulty arises from estimating conditional densities (such as the probability density that a system will over-charge by a certain amount). For the regression setting we introduce tractable approximations of the independence, separation and sufficiency criteria by observing that they factorise as ratios of different conditional probabilities of the protected attributes. We introduce and train machine learning classifiers, distinct from the predictor, as a mechanism to estimate these probabilities from the data. This naturally leads to model agnostic, tractable approximations of the criteria, which we explore experimentally.