91.2LGMay 31
Strong Stochastic Flow MapsSam McCallum, Zander W. Blasingame, Timothy Herschell et al.
Flow and diffusion models generate high-quality samples in many modalities; however, many network evaluations are required during inference due to numerical integration of an underlying differential equation. Flow maps alleviate this problem by learning the solution map of the differential equation directly, enabling few-step sampling. Yet, current methods are restricted to approximating the solution map of ODEs. These methods can be used to learn the transition kernel of an SDE, thereby obtaining a solution map that recovers the marginal distributions of the process (weak convergence) rather than the solution path (strong convergence). We propose Strong Stochastic Flow Maps (SSFMs) as a novel framework for learning the strong solution map of additive-noise SDEs, directly generalizing deterministic flow maps to the stochastic setting. Further, a polynomial approximation to Brownian motion is introduced and shown to converge pathwise. These results enable a simulation-free training objective for the solution map of diffusion models. We demonstrate that SSFMs outperform previous stochastic flow map methods on image generation and enable few-step sampling of molecular systems.
LGOct 15, 2024
Efficient, Accurate and Stable Gradients for Neural ODEsSam McCallum, James Foster
Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible ODE solvers that significantly improve upon both the time and memory cost of recursive checkpointing. The reversible solvers presented calculate exact gradients, are high-order and numerically stable -- strictly improving on previous reversible architectures.
LGSep 16, 2025
Reversible Deep Equilibrium ModelsSam McCallum, Kamran Arora, James Foster
Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often leads to unstable training dynamics and requires regularisation or many function evaluations to fix. Here, we introduce Reversible Deep Equilibrium Models (RevDEQs) that allow for exact gradient calculation, no regularisation and far fewer function evaluations than DEQs. We show that RevDEQs achieve state-of-the-art performance on language modelling and image classification tasks against comparable implicit and explicit models.