Equitable modelling of brain imaging by counterfactual augmentation with morphologically constrained 3D deep generative models
The work addresses data imbalance and inequitable performance in brain imaging models for medical research, proposing a new method for a known bottleneck.
The paper develops CounterSynth, a generative model for synthesising counterfactual training data, and reports state-of-the-art improvements in fidelity and equity on UK Biobank MRI data.
We describe CounterSynth, a conditional generative model of diffeomorphic deformations that induce label-driven, biologically plausible changes in volumetric brain images. The model is intended to synthesise counterfactual training data augmentations for downstream discriminative modelling tasks where fidelity is limited by data imbalance, distributional instability, confounding, or underspecification, and where performance is inequitable across distinct subpopulations. Focusing on demographic attributes, we evaluate the quality of synthesised counterfactuals with voxel-based morphometry, classification and regression of the conditioning attributes, and the Fréchet inception distance. Examining downstream discriminative performance in the context of engineered demographic imbalance and confounding, we use UK Biobank magnetic resonance imaging data to benchmark CounterSynth augmentation against current solutions to these problems. We achieve state-of-the-art improvements in both overall fidelity and equity. The source code for CounterSynth is available online.
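For readers unfamiliar with the Fréchet inception distance mentioned above, the following is a minimal sketch of the underlying Fréchet distance between two Gaussians fitted to feature vectors. It is not taken from the CounterSynth code: the function name `frechet_distance`, the random stand-in features, and the feature dimension are all assumptions for illustration, and the feature extractor applied to the images is not specified in this section.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (n_samples, n_features). In an
    FID-style evaluation the rows would be network features of real and
    synthesised images; here they are arbitrary arrays (an assumption).
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (clipped here against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))           # stand-in "real" features
    synth = rng.normal(0.1, 1.0, size=(500, 8))  # stand-in "synthetic" features
    print(frechet_distance(real, synth))
```

Lower values indicate that the two feature distributions are closer in mean and covariance; identical feature sets give a distance of zero up to numerical error.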