ML, LG, PR | Dec 18, 2025

On The Hidden Biases of Flow Matching Samplers

arXiv:2512.16768v2 | h-index: 13

Originality: Incremental advance

AI Analysis

This work analyzes theoretical biases in flow matching samplers. Its mathematical analysis is an incremental advance, aimed at machine learning practitioners.

The paper investigates the implicit biases of flow matching (FM) samplers. It shows that the empirical FM minimizer is generally not a gradient field, and is therefore not optimal in the Benamou-Brenier optimal transport (OT) sense. It also shows that the kinetic energy of generated samples concentrates exponentially for Gaussian sources and has polynomial tails for heavy-tailed sources.

We study the implicit bias of flow matching (FM) samplers via the lens of empirical flow matching. Although population FM may produce gradient-field velocities resembling optimal transport (OT), we show that the empirical FM minimizer is generally not a gradient field, even when each conditional flow is. Consequently, empirical FM is intrinsically not OT-optimal in the Benamou-Brenier sense. In view of this, we analyze the kinetic energy of generated samples. With Gaussian sources, both instantaneous and integrated kinetic energies exhibit exponential concentration, while heavy-tailed sources lead to polynomial tails. These behaviors are governed primarily by the choice of source distribution rather than the data. Overall, these notes provide a concise mathematical account of the structural and energetic biases arising in empirical FM.
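To make the objects in the abstract concrete, here is a minimal sketch of an empirical FM velocity and the kinetic energy along one sampled trajectory. It assumes the common linear interpolant x_t = (1 - t) x0 + t x1 with a standard Gaussian source, for which the empirical minimizer has a well-known closed form as a softmax-weighted average of conditional velocities. The paper's exact setup and energy normalization may differ, and the names `empirical_fm_velocity` and `kinetic_energies`, the toy data, and the stopping time `t_end` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def empirical_fm_velocity(x, t, data):
    """Closed-form empirical FM velocity (assumed setup: linear path, N(0, I) source).

    x: (d,) query point, t: scalar in [0, 1), data: (n, d) training points.
    """
    sigma2 = (1.0 - t) ** 2
    # Log-weights of the Gaussian conditionals N(t * x_i, (1 - t)^2 I).
    logw = -np.sum((x - t * data) ** 2, axis=1) / (2.0 * sigma2)
    w = np.exp(logw - logw.max())  # subtract the max for numerical stability
    w /= w.sum()
    # Weighted average of the conditional velocities (x_i - x) / (1 - t).
    return (w @ data - x) / (1.0 - t)

def kinetic_energies(x0, data, n_steps=200, t_end=0.95):
    """Euler-integrate one sample from t = 0 to t_end.

    Returns the final point, the instantaneous energies 0.5 * ||v||^2 at each
    step, and their time integral (a Riemann-sum approximation).
    """
    dt = t_end / n_steps
    x = x0.copy()
    inst = np.empty(n_steps)
    for k in range(n_steps):
        v = empirical_fm_velocity(x, k * dt, data)
        inst[k] = 0.5 * (v @ v)
        x = x + dt * v
    return x, inst, inst.sum() * dt

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 2)) + 3.0  # hypothetical toy training set
x0 = rng.normal(size=2)               # one draw from the Gaussian source
x_end, inst, total = kinetic_energies(x0, data)
```

Repeating the last three lines over many source draws (and swapping the Gaussian draw for a heavy-tailed one, with the matching conditional weights) is one way to inspect the energy tails empirically. Integration stops at `t_end < 1` because the closed-form velocity is singular at t = 1.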
