CV, AI, LG | May 7, 2025

Efficient Flow Matching using Latent Variables

arXiv:2505.04486v3 | 6 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses inefficient training of generative models on high-dimensional data. Its contribution is a domain-specific improvement that is incremental but practical.

The paper tackles inefficient learning in flow matching models by introducing Latent-CFM, which conditions on features from pretrained latent variable models to leverage data clustering, resulting in improved generation quality with significantly less training and computation on image and physical datasets.

Flow matching models have shown great potential in image generation tasks among probabilistic generative models. However, most flow matching models in the literature do not explicitly utilize the underlying clustering structure in the target data when learning the flow from a simple source distribution like the standard Gaussian. This leads to inefficient learning, especially for many high-dimensional real-world datasets, which often lie on a low-dimensional manifold. To this end, we present $\texttt{Latent-CFM}$, which provides efficient training strategies by conditioning on the features extracted from data using pretrained deep latent variable models. Through experiments on synthetic data from multi-modal distributions and widely used image benchmark datasets, we show that $\texttt{Latent-CFM}$ exhibits improved generation quality with significantly less training and computation than state-of-the-art flow matching models by adopting pretrained lightweight latent variable models. Beyond natural images, we consider generative modeling of spatial fields stemming from physical processes. Using a 2D Darcy flow dataset, we demonstrate that our approach generates more physically accurate samples than competing approaches. In addition, through latent space analysis, we demonstrate that our approach can be used for image generation conditioned on latent features, which adds interpretability to the generation process.
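
To make the idea in the abstract concrete, here is a minimal NumPy sketch of flow matching with and without a latent feature as an extra input. It is an illustration under stated assumptions, not the paper's implementation. The straight-line path between source and data samples, the toy two-cluster dataset, the linear least-squares "vector field", and the `encode` function (a hypothetical stand-in for a frozen pretrained latent variable model) are all choices made for this example.

```python
import numpy as np


def encode(x1):
    """Hypothetical stand-in for a frozen, pretrained encoder (an assumption).

    Returns one feature per sample: the sign of the first coordinate,
    which identifies the cluster in the toy data below.
    """
    return np.sign(x1[:, :1])


def make_batch(x1, rng):
    """Build flow matching inputs and targets along straight-line paths."""
    x0 = rng.standard_normal(x1.shape)        # source samples from N(0, I)
    t = rng.uniform(size=(x1.shape[0], 1))    # one time in [0, 1) per sample
    xt = (1.0 - t) * x0 + t * x1              # point on the path from x0 to x1
    ut = x1 - x0                              # velocity of that path (regression target)
    return xt, t, ut


def design(xt, t, f=None):
    """Stack model inputs column-wise; the latent feature f is optional."""
    cols = [xt, t, np.ones_like(t)]
    if f is not None:
        cols.append(f)
    return np.concatenate(cols, axis=1)


def fit_and_loss(inputs, ut):
    """Fit a linear vector field by least squares and return its mean squared error.

    A real model would be a neural network trained by gradient descent;
    the linear fit keeps the sketch short and deterministic.
    """
    weights, *_ = np.linalg.lstsq(inputs, ut, rcond=None)
    residual = inputs @ weights - ut
    return float(np.mean(np.sum(residual ** 2, axis=1)))


rng = np.random.default_rng(0)

# Toy target data: two well-separated clusters in 2-D.
centers = np.array([[-4.0, 0.0], [4.0, 0.0]])
x1 = centers[rng.integers(0, 2, size=2048)] + 0.1 * rng.standard_normal((2048, 2))

xt, t, ut = make_batch(x1, rng)
f = encode(x1)  # features extracted from the data samples

loss_plain = fit_and_loss(design(xt, t), ut)       # field sees (x_t, t) only
loss_latent = fit_and_loss(design(xt, t, f), ut)   # field also sees the feature

print(f"loss without latent feature: {loss_plain:.3f}")
print(f"loss with latent feature:    {loss_latent:.3f}")
```

Because the feature-conditioned inputs contain every column of the unconditioned ones, the second fit can never have a larger training loss than the first. On clustered data like this, the cluster feature tells the field which mode a path is heading toward, which is the intuition behind conditioning on latent features. At sampling time the features would have to come from the latent variable model itself rather than from data, a step this sketch omits.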

