LG | AI | May 2, 2025

Distilling Two-Timed Flow Models by Separately Matching Initial and Terminal Velocities

arXiv:2505.01169v2 | 1 citation | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient generative models in machine learning, though it is incremental as it builds on existing distillation methods.

The paper distills flow matching models into two-timed flow models for faster generation. It proposes a new loss function, initial/terminal velocity matching (ITVM), which improves few-step generation performance across multiple types of datasets and model architectures.

A flow matching model learns a time-dependent vector field $v_t(x)$ that generates a probability path $\{ p_t \}_{0 \leq t \leq 1}$ that interpolates between a well-known noise distribution ($p_0$) and the data distribution ($p_1$). It can be distilled into a two-timed flow model (TTFM) $\phi_{s,x}(t)$ that can transform a sample belonging to the distribution at an initial time $s$ to another belonging to the distribution at a terminal time $t$ in one function evaluation. We present a new loss function for TTFM distillation called the initial/terminal velocity matching (ITVM) loss that extends the Lagrangian Flow Map Distillation (LFMD) loss proposed by Boffi et al. by adding redundant terms to match the initial velocities at time $s$, removing the derivative from the terminal velocity term at time $t$, and using a version of the model under training, stabilized by exponential moving averaging (EMA), to compute the target terminal average velocity. Preliminary experiments show that our loss leads to better few-step generation performance on multiple types of datasets and model architectures over baselines.
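To make the notation concrete, the following is a minimal Python sketch that is not taken from the paper. It assumes a hypothetical linear field $v_t(x) = a x$, chosen only because its flow map $\phi_{s,x}(t) = x e^{a (t - s)}$ is known in closed form. Under the usual flow-map definition, the sketch checks numerically that the time derivative of $\phi_{s,x}(t)$ equals $v_s(x)$ at the initial time $t = s$ and equals $v_t(\phi_{s,x}(t))$ at the terminal time $t$, which is our reading of the "initial" and "terminal" velocities named in the abstract. It also checks that two shorter jumps compose to one long jump, the property that few-step generation relies on. The coefficient `A`, the function names, and the sample values are illustrative assumptions; the sketch does not implement the ITVM or LFMD loss.

```python
import math

A = -0.7  # hypothetical coefficient of the toy vector field (assumption)


def velocity(t, x):
    """Toy vector field v_t(x) = A * x (constant in time for simplicity)."""
    return A * x


def flow_map(s, x, t):
    """Exact two-timed flow map phi_{s,x}(t) = x * exp(A * (t - s)) for the toy field."""
    return x * math.exp(A * (t - s))


def d_dt(s, x, t, h=1e-5):
    """Central finite-difference estimate of the time derivative of phi_{s,x}(t)."""
    return (flow_map(s, x, t + h) - flow_map(s, x, t - h)) / (2 * h)


s, t, x = 0.2, 0.9, 1.5  # illustrative initial time, terminal time, and sample

# Initial velocity: d/dt phi_{s,x}(t) at t = s should equal v_s(x).
initial_residual = d_dt(s, x, s) - velocity(s, x)

# Terminal velocity: d/dt phi_{s,x}(t) should equal v_t(phi_{s,x}(t)).
terminal_residual = d_dt(s, x, t) - velocity(t, flow_map(s, x, t))

# Composition: jumping s -> mid -> t should agree with jumping s -> t directly.
mid = 0.5
two_step = flow_map(mid, flow_map(s, x, mid), t)
one_step = flow_map(s, x, t)

print(f"initial residual:  {initial_residual:.2e}")
print(f"terminal residual: {terminal_residual:.2e}")
print(f"composition gap:   {abs(two_step - one_step):.2e}")
```

In a distillation setting the closed-form `flow_map` would be replaced by a learned network and `velocity` by the pretrained flow matching model, so residuals of this kind would presumably be driven toward zero by training rather than vanishing by construction.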

