Expected Batch Optimal Transport Plans and Consequences for Flow Matching
For practitioners of flow matching and optimal transport, this paper provides theoretical grounding for minibatch OT and practical guidance on batch size selection.
The paper formalizes the expected batch optimal transport plan, shows that it converges to the true OT plan as the batch size grows, and derives convergence rates in the semidiscrete setting. In flow matching, this yields a regular velocity field and quantifies how batch size affects numerical integration.
Solving optimal transport (OT) on random minibatches is a common surrogate for exact OT in large-scale learning. In flow matching (FM), this surrogate is used to obtain OT-like couplings that can straighten probability paths and reduce numerical integration cost. Yet, the population-level coupling induced by repeated minibatch OT remains only partially understood. We formalize this coupling as the expected batch OT plan $\overline{\pi}_{k}$, obtained by averaging empirical OT plans over independent minibatches of size $k$. We then establish its large-batch consistency and, in the semidiscrete case relevant to generative modeling, derive rates for both the transport-cost bias and the convergence of $\overline{\pi}_{k}$ to the OT plan. For FM, this yields a population coupling whose induced velocity field is regular enough to define a unique flow from the source to the discrete target. We finally quantify how OT batch size interacts with numerical integration in a tractable two-atom model and in synthetic and image experiments.
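To make the object $\overline{\pi}_{k}$ concrete, the following is a minimal Monte Carlo sketch, not the authors' implementation. It assumes a one-dimensional standard Gaussian source, a target with two equally weighted atoms at $-1$ and $+1$, and squared Euclidean cost; these choices and all function names are illustrative assumptions. With uniform weights and equal batch sizes, an optimal plan between two minibatches can be taken to be a permutation, so each batch OT problem is solved here as an assignment problem with SciPy's `linear_sum_assignment`. Pooling the matched pairs over many independent batches gives approximate samples from the expected batch OT plan (for the particular optimal plan the solver selects).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def batch_ot_pairs(x, y):
    """Pair x[i] with y[perm[i]] using an optimal assignment for squared Euclidean cost."""
    # Pairwise squared distances, shape (k, k).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return x[rows], y[cols]


def sample_expected_batch_plan(k, n_batches, rng):
    """Pool matched pairs from n_batches independent minibatches of size k.

    Assumed toy setup: source N(0, 1), target uniform on the atoms {-1, +1}.
    """
    atoms = np.array([[-1.0], [1.0]])
    xs, ys = [], []
    for _ in range(n_batches):
        x = rng.standard_normal((k, 1))          # source minibatch
        y = atoms[rng.integers(0, 2, size=k)]    # target minibatch
        xb, yb = batch_ot_pairs(x, y)
        xs.append(xb)
        ys.append(yb)
    return np.concatenate(xs), np.concatenate(ys)


def mean_cost(k, n_batches=200, seed=0):
    """Monte Carlo estimate of the transport cost of the expected batch OT plan."""
    rng = np.random.default_rng(seed)
    x, y = sample_expected_batch_plan(k, n_batches, rng)
    return float(((x - y) ** 2).sum(axis=-1).mean())


if __name__ == "__main__":
    for k in (1, 4, 16, 64):
        print(k, mean_cost(k))
```

In this assumed setup, batch size $k = 1$ reproduces the independent coupling, whose expected cost is $2$, while the exact OT cost is $2 - 2\sqrt{2/\pi} \approx 0.404$. The printed estimates for larger $k$ give a rough picture of the transport-cost bias discussed above, up to Monte Carlo error.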