Hugh Dance

ML
h-index: 7
Papers: 5
Citations: 21
Novelty: 62%
AI Score: 48

5 Papers

ML · May 8
Debiased Counterfactual Generation via Flow Matching from Observations

Hugh Dance, Johnny Xi, Peter Orbanz et al.

Estimating counterfactual distributions under interventions is central to treatment risk assessment and counterfactual generation tasks. Existing approaches model the counterfactual distribution as a standalone generative target, without exploiting its relationship to the observational data. In this work, we show that under standard assumptions, observational and counterfactual outcome distributions are tightly linked: they have identical support and tail behavior, remain statistically close under weak confounding, and share any features of high-dimensional outcomes that are invariant to confounders. These properties motivate learning counterfactual distributions not from scratch, but via a deconfounding flow from the observational distribution. We formulate this problem via flow matching and derive a semiparametrically efficient estimator based on a novel efficient influence function correction. We subsequently extend our estimator to target minimal-energy flows in high dimensions, which we show can be especially simple targets between observational and counterfactual distributions. In experiments, deconfounding flows outperform existing debiased counterfactual distribution estimators, while also mitigating known failure modes of flow-based methods.

ML · Oct 18, 2024
Interventional Processes for Causal Uncertainty Quantification

Hugh Dance, Peter Orbanz, Arthur Gretton

Reliable uncertainty quantification for causal effects is crucial in various applications, but remains difficult in nonparametric models, particularly for continuous treatments. We introduce IMPspec, a Gaussian process (GP) framework for modeling uncertainty over interventional causal functions under continuous treatments, which can be represented using reproducing kernel Hilbert spaces (RKHSs). By using principled function class expansions and a spectral representation of RKHS features, IMPspec yields tractable training and inference, provides a spectral algorithm to calibrate posterior credible intervals, and avoids the underfitting and variance collapse pathologies of earlier GP-on-RKHS methods. Across synthetic benchmarks and an application in healthcare, IMPspec delivers state-of-the-art performance in causal uncertainty quantification and downstream causal Bayesian optimization tasks.

MS · Mar 20, 2025
Efficiently Vectorized MCMC on Modern Accelerators

Hugh Dance, Pierre Glaser, Peter Orbanz et al.

With the advent of automatic vectorization tools (e.g., JAX's $\texttt{vmap}$), writing multi-chain MCMC algorithms is often now as simple as invoking those tools on single-chain code. Whilst convenient, for various MCMC algorithms this results in a synchronization problem -- loosely speaking, at each iteration all chains running in parallel must wait until the last chain has finished drawing its sample. In this work, we show how to design single-chain MCMC algorithms in a way that avoids synchronization overheads when vectorizing with tools like $\texttt{vmap}$ by using the framework of finite state machines (FSMs). Using a simplified model, we derive an exact theoretical form of the obtainable speed-ups using our approach, and use it to make principled recommendations for optimal algorithm design. We implement several popular MCMC algorithms as FSMs, including Elliptical Slice Sampling, HMC-NUTS, and Delayed Rejection, demonstrating speed-ups of up to an order of magnitude in experiments.
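The synchronization problem described in this abstract can be illustrated with a toy cost model. The sketch below is not the paper's simplified model or its speed-up formula: it assumes, purely for illustration, that each chain needs a geometrically distributed number of inner-loop steps per sample, and the names `simulate_costs`, `num_chains`, and `p_accept` are hypothetical. It compares a lock-step schedule, where every chain waits for the slowest chain at each sample, with an idealized schedule in which each chain starts its next sample as soon as its own sample is done.

```python
import random


def simulate_costs(num_chains=8, num_samples=100, p_accept=0.3, seed=0):
    """Toy comparison of lock-step vs. independent chain progress.

    Assumption (illustrative only): the number of inner-loop steps a chain
    needs to draw one sample is geometric with success probability p_accept.
    """
    rng = random.Random(seed)

    def geometric():
        # Count trials up to and including the first success.
        k = 1
        while rng.random() > p_accept:
            k += 1
        return k

    # steps[i][j] = inner-loop steps chain i needs to draw sample j.
    steps = [[geometric() for _ in range(num_samples)]
             for _ in range(num_chains)]

    # Lock-step: each sample costs as many steps as the slowest chain needs.
    lockstep = sum(max(steps[i][j] for i in range(num_chains))
                   for j in range(num_samples))

    # Idealized independent progress: a chain never waits for the others,
    # so the total cost is that of the chain with the most work overall.
    independent = max(sum(chain) for chain in steps)

    return lockstep, independent


lockstep_cost, independent_cost = simulate_costs()
```

Because a maximum of sums never exceeds the sum of maxima, `independent_cost` is at most `lockstep_cost` for any draw of step counts; the gap between the two is the waiting time that the lock-step schedule introduces in this toy model.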

ML · Feb 7, 2025
Distinguishing Cause from Effect with Causal Velocity Models

Johnny Xi, Hugh Dance, Peter Orbanz et al.

Bivariate structural causal models (SCMs) are often used to infer causal direction by examining their goodness-of-fit under restricted model classes. In this paper, we describe a parametrization of bivariate SCMs in terms of a causal velocity by viewing the cause variable as time in a dynamical system. The velocity implicitly defines counterfactual curves via the solution of initial value problems where the observation specifies the initial condition. Using tools from measure transport, we obtain a unique correspondence between SCMs and the score function of the generated distribution via its causal velocity. Based on this, we derive an objective function that directly regresses the velocity against the score function, the latter of which can be estimated non-parametrically from observational data. We use this to develop a method for bivariate causal discovery that extends beyond known model classes such as additive or location-scale noise, and that requires no assumptions on the noise distributions. When the score is estimated well, the objective is also useful for detecting model non-identifiability and misspecification. We present positive results in simulation and benchmark experiments where many existing methods fail, and perform ablation studies to examine the method's sensitivity to accurate score estimation.
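As a rough illustration of how a velocity can define a counterfactual curve through an initial value problem, the sketch below integrates dy/dx = v(x, y) from an observed point with a classical fourth-order Runge-Kutta scheme. It is an assumption-laden toy, not the paper's method: the function names are hypothetical, and the velocity v(x, y) = 2x is chosen here as the one corresponding to an additive noise model y = x^2 + e, where changing x shifts y along f(x) = x^2 while the noise stays fixed.

```python
def counterfactual_outcome(velocity, x_obs, y_obs, x_new, num_steps=100):
    """Integrate dy/dx = velocity(x, y) from (x_obs, y_obs) to x = x_new.

    Uses classical RK4 with a fixed step size; the observation supplies
    the initial condition of the initial value problem.
    """
    h = (x_new - x_obs) / num_steps
    x, y = x_obs, y_obs
    for _ in range(num_steps):
        k1 = velocity(x, y)
        k2 = velocity(x + h / 2, y + h * k1 / 2)
        k3 = velocity(x + h / 2, y + h * k2 / 2)
        k4 = velocity(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def anm_velocity(x, y):
    # Illustrative assumption: additive noise model y = x**2 + e,
    # so the outcome changes with x at rate 2x regardless of y.
    return 2.0 * x


# Observation (x, y) = (1.0, 1.5) implies noise e = 0.5 under the toy model;
# the counterfactual at x = 2.0 should therefore be 2.0**2 + 0.5.
y_cf = counterfactual_outcome(anm_velocity, 1.0, 1.5, 2.0)
```

In this toy case the integrated curve can be checked against the closed form y = x^2 + 0.5, which is what makes the additive noise example convenient for a sanity check.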

ML · Nov 8, 2021
Fast and Scalable Spike and Slab Variable Selection in High-Dimensional Gaussian Processes

Hugh Dance, Brooks Paige

Variable selection in Gaussian processes (GPs) is typically undertaken by thresholding the inverse lengthscales of automatic relevance determination kernels, but in high-dimensional datasets this approach can be unreliable. A more probabilistically principled alternative is to use spike and slab priors and infer a posterior probability of variable inclusion. However, existing implementations in GPs are very costly to run in both high-dimensional and large-$n$ datasets, or are only suitable for unsupervised settings with specific kernels. As such, we develop a fast and scalable variational inference algorithm for the spike and slab GP that is tractable with arbitrary differentiable kernels. We improve our algorithm's ability to adapt to the sparsity of relevant variables by Bayesian model averaging over hyperparameters, and achieve substantial speed-ups using zero-temperature posterior restrictions, dropout pruning and nearest neighbour minibatching. In experiments, our method consistently outperforms vanilla and sparse variational GPs whilst retaining similar runtimes (even when $n=10^6$), and performs competitively with a spike and slab GP using MCMC but runs up to $1000$ times faster.
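For readers unfamiliar with the baseline this abstract starts from, the sketch below shows a squared-exponential automatic relevance determination (ARD) kernel and variable selection by thresholding its inverse lengthscales. It illustrates only that baseline, not the proposed spike and slab method; the function names and the threshold of 0.1 are arbitrary choices made for this example.

```python
import math


def ard_rbf_kernel(x, z, inv_lengthscales, variance=1.0):
    """Squared-exponential ARD kernel with one inverse lengthscale per input.

    An inverse lengthscale of zero removes that input dimension from the
    kernel entirely, which is why small values are read as 'irrelevant'.
    """
    sq_dist = sum((ell * (a - b)) ** 2
                  for a, b, ell in zip(x, z, inv_lengthscales))
    return variance * math.exp(-0.5 * sq_dist)


def select_by_threshold(inv_lengthscales, threshold=0.1):
    """Return indices of inputs whose inverse lengthscale exceeds threshold.

    The threshold is a free choice (0.1 here is arbitrary), which is one
    reason this heuristic can be hard to apply reliably.
    """
    return [d for d, ell in enumerate(inv_lengthscales)
            if abs(ell) > threshold]


# Hypothetical fitted inverse lengthscales for three input dimensions.
selected = select_by_threshold([2.0, 0.01, 0.5])
```

Here `selected` keeps dimensions 0 and 2 and drops dimension 1, whose inverse lengthscale falls below the threshold.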