ML, LG, May 13

Coreset-Induced Conditional Velocity Flow Matching

arXiv:2605.12951

AI Analysis

For generative modeling practitioners, CCVFM offers a principled way to reduce the computational burden of flow matching by leveraging coresets, though the gains are incremental over existing hierarchical flow methods.

CCVFM replaces the isotropic Gaussian source in hierarchical flow matching with a Gaussian mixture built from a coreset of the target, enabling closed-form sampling of the conditional velocity law and requiring only a lightweight correction flow. It achieves competitive few-step generation on MNIST, CIFAR-10, ImageNet-32, and CelebA-HQ.

We propose Coreset-Induced Conditional Velocity Flow Matching (CCVFM), a generative model that augments hierarchical rectified flow with a data-informed source distribution. Hierarchical flow matching models the full conditional velocity law in velocity space, but its inner flow is asked to transport isotropic Gaussian noise to a multimodal target velocity distribution from scratch. Our key observation is that this inner source can be replaced by a closed-form surrogate built from a coreset of the target. CCVFM first compresses the target into weighted atoms using an entropic Sinkhorn coreset and lifts them to a Gaussian mixture. The induced conditional velocity law is then a closed-form Gaussian mixture that can be sampled without a learned neural sampler. A lightweight correction flow, trained from this exact surrogate source, then refines the remaining surrogate-to-target residual rather than learning an entire noise-to-data map. We prove that the surrogate transport cost equals the target-surrogate Wasserstein gap under an explicit compression assumption, whereas the noise-source analogue has a dimension-scale lower bound. We further characterize the conditional second moment of the direct surrogate-source training target and show that its source-dependent excess is small when the surrogate conditional law is close to the true conditional velocity law in mean and covariance. Empirically, on MNIST, CIFAR-10, ImageNet-32, and CelebA-HQ, the proposed method reaches competitive few-step generation under matched architectures.
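To make the "closed-form Gaussian mixture" claim concrete, the sketch below works through one simple setting. The abstract does not spell these details out, so they are assumptions here: a standard normal outer source x_0 ~ N(0, I), the linear path x_t = (1 - t) x_0 + t x_1 with velocity v = x_1 - x_0, and a surrogate target that is an isotropic mixture sum_k w_k N(mu_k, sigma^2 I) over coreset atoms mu_k. Under those assumptions, standard Gaussian conditioning gives v | x_t as a mixture whose component k has posterior weight proportional to w_k N(x_t; t mu_k, s I), mean mu_k + (c / s)(x_t - t mu_k), and covariance (sigma^2 / s) I, where s = (1 - t)^2 + t^2 sigma^2 and c = t sigma^2 - (1 - t). The function names, the isotropic covariance, and the bandwidth `sigma` are hypothetical and are not taken from the paper.

```python
import numpy as np


def conditional_velocity_params(x_t, t, mus, weights, sigma):
    """Parameters of v | x_t under the assumed isotropic-mixture surrogate.

    x_t: (d,) current point; t: time in [0, 1]; mus: (K, d) coreset atoms;
    weights: (K,) positive atom weights summing to 1; sigma: bandwidth > 0.
    Returns posterior weights (K,), component means (K, d), shared variance.
    """
    s = (1.0 - t) ** 2 + (t * sigma) ** 2   # per-coordinate Var(x_t | k)
    c = t * sigma ** 2 - (1.0 - t)          # per-coordinate Cov(v, x_t | k)
    resid = x_t[None, :] - t * mus          # x_t minus E[x_t | k], shape (K, d)

    # Posterior over components; the Gaussian normaliser is shared and cancels.
    log_post = np.log(weights) - 0.5 * np.sum(resid ** 2, axis=1) / s
    log_post -= log_post.max()              # stabilise before exponentiating
    post = np.exp(log_post)
    post /= post.sum()

    means = mus + (c / s) * resid           # E[v | x_t, k]
    var = sigma ** 2 / s                    # Var(v | x_t, k) per coordinate
    return post, means, var


def sample_conditional_velocity(x_t, t, mus, weights, sigma, rng):
    """Draw one velocity sample from the closed-form mixture, no network needed."""
    post, means, var = conditional_velocity_params(x_t, t, mus, weights, sigma)
    k = rng.choice(len(post), p=post)       # pick a mixture component
    return means[k] + np.sqrt(var) * rng.standard_normal(x_t.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mus = np.array([[2.0, 0.0], [-2.0, 0.0]])   # two toy coreset atoms
    weights = np.array([0.5, 0.5])
    v = sample_conditional_velocity(np.array([0.8, 0.1]), 0.5, mus, weights, 0.3, rng)
    print(v.shape)
```

In this reading, samples like `v` would play the role of the inner source that the correction flow starts from, in place of isotropic noise. A quick sanity check on the formulas: at t = 0 they reduce to mean mu_k - x_0 and variance sigma^2, which is what v = x_1 - x_0 gives directly when x_0 is known.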
