CV, AI, LG, MM | Oct 25, 2024

Flow Generator Matching

arXiv:2410.19310v1 | 27 citations | h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck of sampling from flow-matching models in AIGC. It reduces sampling to a single generation step with theoretical guarantees, though the advance is incremental because it builds on the existing flow-matching paradigm.

This paper tackles the high computational cost of sampling from flow-matching models by introducing Flow Generator Matching (FGM), which accelerates sampling to one-step generation while maintaining performance. The one-step model achieves an FID of 3.08 on unconditional CIFAR10 generation, a record among few-step flow-matching-based models, and distilling Stable Diffusion 3 with FGM yields an efficient one-step text-to-image model.

In the realm of Artificial Intelligence Generated Content (AIGC), flow-matching models have emerged as a leading approach, owing to their robust theoretical underpinnings and their ability to scale to large generative modeling tasks. These models have demonstrated state-of-the-art performance, but that performance comes at a cost: sampling from them is computationally demanding, because it requires numerically solving an ordinary differential equation (ODE) over many steps. Against this backdrop, this paper presents Flow Generator Matching (FGM), an approach with theoretical guarantees that accelerates the sampling of flow-matching models to one-step generation while maintaining the original performance. On the CIFAR10 unconditional generation benchmark, our one-step FGM model achieves a new record Fréchet Inception Distance (FID) score of 3.08 among few-step flow-matching-based models, outperforming the original 50-step flow-matching models. Furthermore, we use FGM to distill Stable Diffusion 3, a leading text-to-image flow-matching model based on the MM-DiT architecture. The resulting one-step text-to-image model, MM-DiT-FGM, demonstrates industry-level performance. When evaluated on the GenEval benchmark, MM-DiT-FGM delivers generation quality that rivals multi-step models while using only a single generation step.
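To make the cost gap described in the abstract concrete, the following minimal sketch contrasts multi-step ODE sampling with a single-step generator on a toy 1D problem. It is not the FGM training procedure. The target distribution N(MU, S^2), the closed-form velocity field, and all names (`Velocity`, `euler_sample`, `one_step_generator`) are assumptions chosen for illustration, and the one-step generator here is simply the exact flow map of this toy problem, standing in for a distilled generator.

```python
import random

# Hypothetical toy setup: source N(0, 1), target N(MU, S^2), linear path
# x_t = (1 - t) * x0 + t * x1 with independent x0 and x1.
MU, S = 2.0, 0.5


class Velocity:
    """Closed-form marginal velocity of the toy path; counts evaluations."""

    def __init__(self):
        self.calls = 0  # stands in for the number of network forward passes

    def __call__(self, xs, t):
        self.calls += 1
        var = (1.0 - t) ** 2 + (t * S) ** 2   # Var[x_t]
        half_dvar = -(1.0 - t) + t * S ** 2   # 0.5 * d/dt Var[x_t]
        slope = half_dvar / var
        # u_t(x) = d/dt E[x_t] + slope * (x - E[x_t]), with E[x_t] = t * MU
        return [MU + slope * (x - t * MU) for x in xs]


def euler_sample(velocity, zs, num_steps):
    """Multi-step sampling: integrate dx/dt = u_t(x) from t=0 to t=1."""
    xs = list(zs)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        vs = velocity(xs, k * dt)
        xs = [x + dt * v for x, v in zip(xs, vs)]
    return xs


def one_step_generator(zs):
    """One-step sampling: the exact noise-to-data map of this toy flow."""
    return [MU + S * z for z in zs]


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(2000)]

v = Velocity()
x_multi = euler_sample(v, z, num_steps=50)  # 50 velocity evaluations
x_one = one_step_generator(z)               # a single evaluation

max_gap = max(abs(a - b) for a, b in zip(x_multi, x_one))
print("velocity evaluations (multi-step):", v.calls)
print("max |multi-step - one-step|:", max_gap)
```

In this toy case the one-step map is available in closed form; for a real flow-matching model it is not, which is why a method such as FGM has to train the one-step generator.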

Code Implementations: 2 repos
