Stephen Y. Zhang

h-index: 9

4 papers

13 citations

Novelty: 54%

AI Score: 38

Ranked #107,910 of 201,018 authors (top 54%); #24,037 in LG (top 57%)

4 Papers

LG · Sep 29, 2025

Flow Matching with Semidiscrete Couplings

Alireza Mousavi-Hosseini, Stephen Y. Zhang, Michal Klein et al.

Flow models parameterized as time-dependent velocity fields can generate data from noise by integrating an ODE. These models are often trained using flow matching, i.e. by sampling random pairs of noise and target points $(\mathbf{x}_0,\mathbf{x}_1)$ and ensuring that the velocity field is aligned, on average, with $\mathbf{x}_1-\mathbf{x}_0$ when evaluated along a segment linking $\mathbf{x}_0$ to $\mathbf{x}_1$. While these pairs are sampled independently by default, they can also be selected more carefully by matching batches of $n$ noise points to $n$ target points using an optimal transport (OT) solver. Although promising in theory, the OT flow matching (OT-FM) approach is not widely used in practice. Zhang et al. (2025) recently pointed out that OT-FM only starts paying off when the batch size $n$ grows significantly, which only a multi-GPU implementation of the Sinkhorn algorithm can handle. Unfortunately, the cost of running Sinkhorn can quickly balloon, requiring $O(n^2/\varepsilon^2)$ operations for every $n$ pairs used to fit the velocity field, where $\varepsilon$ is a regularization parameter that should typically be small to yield better results. To fulfill the theoretical promises of OT-FM, we propose to move away from batch-OT and rely instead on a semidiscrete formulation that leverages the fact that the target dataset distribution is usually of finite size $N$. The semidiscrete OT (SD-OT) problem is solved by estimating a dual potential vector using SGD; using that vector, freshly sampled noise vectors at train time can then be matched with data points at the cost of a maximum inner product search (MIPS). Semidiscrete FM (SD-FM) removes the quadratic dependency on $n/\varepsilon$ that bottlenecks OT-FM. SD-FM beats both FM and OT-FM on all training metrics and inference budget constraints, across multiple datasets, on unconditional/conditional generation, or when using mean-flow models.
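The matching step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes an unregularized semidiscrete problem with cost $\tfrac{1}{2}\|\mathbf{x}-\mathbf{y}\|^2$, uniform weights on the $N$ data points, standard Gaussian noise, and a plain decaying-step SGD; the names `sd_assign` and `fit_dual_potential` are hypothetical. Under that cost, minimizing $c(\mathbf{x},\mathbf{y}_j) - g_j$ over $j$ is the same as maximizing $\langle \mathbf{x},\mathbf{y}_j\rangle - \tfrac{1}{2}\|\mathbf{y}_j\|^2 + g_j$, which is why the assignment reduces to an inner product search.

```python
import numpy as np

def sd_assign(x, y, g):
    """Match each noise row of x to a data row of y given dual potential g.

    Assumes cost 0.5*||x - y||^2, so argmin_j (cost - g_j) equals
    argmax_j (<x, y_j> - 0.5*||y_j||^2 + g_j), an inner product search.
    """
    scores = x @ y.T - 0.5 * np.sum(y**2, axis=1) + g
    return np.argmax(scores, axis=1)

def fit_dual_potential(y, n_steps=2000, batch=256, lr=0.5, seed=0):
    """Estimate g by stochastic gradient ascent on the semidiscrete dual.

    Assumption: uniform target weights 1/N and standard Gaussian noise.
    The stochastic gradient for g_j is (target weight of j) minus
    (fraction of the noise batch currently assigned to j).
    """
    rng = np.random.default_rng(seed)
    n_points, dim = y.shape
    weights = np.full(n_points, 1.0 / n_points)
    g = np.zeros(n_points)
    for t in range(n_steps):
        x = rng.standard_normal((batch, dim))
        idx = sd_assign(x, y, g)
        freq = np.bincount(idx, minlength=n_points) / batch
        g += lr / np.sqrt(t + 1) * (weights - freq)
    return g
```

Once `g` is fitted, each fresh noise batch only needs one call to `sd_assign`, with no per-batch OT solve; a large-scale version would presumably replace the dense `argmax` with an approximate MIPS index.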

ML · May 22, 2025

Learning non-equilibrium diffusions with Schrödinger bridges: from exactly solvable to simulation-free

Stephen Y. Zhang, Michael P H Stumpf

We consider the Schrödinger bridge problem which, given ensemble measurements of the initial and final configurations of a stochastic dynamical system and some prior knowledge on the dynamics, aims to reconstruct the "most likely" evolution of the system compatible with the data. Most existing work assumes Brownian reference dynamics and is implicitly limited to potential-driven dynamics. We depart from this regime and consider reference processes described by a multivariate Ornstein-Uhlenbeck process with generic drift matrix $\mathbf{A} \in \mathbb{R}^{d \times d}$. When $\mathbf{A}$ is asymmetric, this corresponds to a non-equilibrium system with non-conservative forces at play: this is important for applications to biological systems, which naturally exist out of equilibrium. In the case of Gaussian marginals, we derive explicit expressions that characterise the solution of both the static and dynamic Schrödinger bridge. For general marginals, we propose mvOU-OTFM, a simulation-free algorithm based on flow and score matching for learning the Schrödinger bridge. In application to a range of problems based on synthetic and real single-cell data, we demonstrate that mvOU-OTFM achieves higher accuracy compared to competing methods, whilst being significantly faster to train.
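To make the reference process concrete, here is a small Euler-Maruyama sketch of a multivariate Ornstein-Uhlenbeck process. It assumes the form $\mathrm{d}\mathbf{X}_t = \mathbf{A}\mathbf{X}_t\,\mathrm{d}t + \sigma\,\mathrm{d}\mathbf{W}_t$ with scalar noise level $\sigma$ and no mean offset; the abstract does not state the exact parameterisation, so this convention and the name `simulate_ou` are illustrative assumptions. Note that the paper's method is simulation-free; the simulation here only illustrates the reference dynamics.

```python
import numpy as np

def simulate_ou(A, x0, sigma=1.0, dt=0.01, n_steps=100, seed=0):
    """Euler-Maruyama paths of dX = A X dt + sigma dW (assumed form).

    x0 has shape (n_particles, d); returns an array of shape
    (n_steps + 1, n_particles, d) holding every intermediate state.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # x @ A.T applies the drift matrix A to each particle (row) of x.
        x = x + (x @ A.T) * dt + sigma * np.sqrt(dt) * noise
        path.append(x.copy())
    return np.stack(path)

# An asymmetric drift: damping plus a rotational, non-conservative part.
A_rot = np.array([[-1.0, -2.0],
                  [ 2.0, -1.0]])
paths = simulate_ou(A_rot, x0=np.ones((5, 2)), sigma=0.1)
```

With a symmetric $\mathbf{A}$ the drift is the gradient of a quadratic potential; the antisymmetric part of `A_rot` adds a rotation that no potential can produce, which is the non-equilibrium regime the abstract refers to.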

LG · Oct 18, 2025

Simulation-free Structure Learning for Stochastic Dynamics

Noah El Rimawi-Fine, Adam Stecklov, Lucas Nelson et al.

Modeling dynamical systems and unraveling their underlying causal relationships is central to many domains in the natural sciences. Various physical systems, such as those arising in cell biology, are inherently high-dimensional and stochastic in nature, and admit only partial, noisy state measurements. This poses a significant challenge for addressing the problems of modeling the underlying dynamics and inferring the network structure of these systems. Existing methods are typically tailored either for structure learning or modeling dynamics at the population level, but are limited in their ability to address both problems together. In this work, we address both problems simultaneously: we present StructureFlow, a novel and principled simulation-free approach for jointly learning the structure and stochastic population dynamics of physical systems. We showcase the utility of StructureFlow for the tasks of structure learning from interventions and dynamical (trajectory) inference of conditional population dynamics. We empirically evaluate our approach on high-dimensional synthetic systems, a set of biologically plausible simulated systems, and an experimental single-cell dataset. We show that StructureFlow can learn the structure of underlying systems while simultaneously modeling their conditional population dynamics -- a key step toward the mechanistic understanding of systems behavior.

ML · Apr 4, 2021

A unified framework for non-negative matrix and tensor factorisations with a smoothed Wasserstein loss

Stephen Y. Zhang

Non-negative matrix and tensor factorisations are a classical tool for finding low-dimensional representations of high-dimensional datasets. In applications such as imaging, datasets can be regarded as distributions supported on a space with metric structure. In such a setting, a loss function based on the Wasserstein distance of optimal transportation theory is a natural choice since it incorporates the underlying geometry of the data. We introduce a general mathematical framework for computing non-negative factorisations of both matrices and tensors with respect to an optimal transport loss. We derive an efficient computational method for its solution using a convex dual formulation, and demonstrate the applicability of this approach with several numerical illustrations with both matrix and tensor-valued data.
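As a rough illustration of what an optimal transport loss between two histograms involves, the sketch below computes an entropy-regularized transport plan with Sinkhorn iterations. Reading "smoothed" as entropic regularization is an assumption here, since the abstract does not spell it out, and the function `sinkhorn_plan` is a hypothetical helper rather than the paper's convex dual method.

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.1, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b.

    C[i, j] is the ground cost between bin i of a and bin j of b;
    eps is the regularization strength (an assumed smoothing choice).
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)         # rescale to match column marginals
        u = a / (K @ v)           # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Two histograms on a 1D grid with squared-distance ground cost.
grid = np.linspace(0.0, 1.0, 5)
C = (grid[:, None] - grid[None, :]) ** 2
a = np.array([0.4, 0.3, 0.2, 0.1, 0.0]) + 1e-3
b = a[::-1].copy()
a, b = a / a.sum(), b / b.sum()
P = sinkhorn_plan(a, b, C, eps=0.5)
transport_cost = float(np.sum(P * C))
```

Because the cost matrix encodes distances between bins, moving mass to a nearby bin is penalized less than moving it far away, which is the geometric sensitivity the abstract contrasts with bin-wise losses.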