ML, LG | Nov 20, 2025

Time dependent loss reweighting for flow matching and diffusion models is theoretically justified

arXiv:2511.16599v1 | 3 citations | h-index: 2
Originality: Synthesis-oriented
AI Analysis

The paper gives theoretical justification for incremental improvements to how flow and diffusion models are trained, which benefits machine learning practitioners.

The paper clarifies that in Generator Matching and Edit Flows, loss functions can depend on both time and state. This theoretically justifies the time-dependent loss weighting used in practice to stabilize training, and it often simplifies the construction of $X_1$-predictor schemes.

This brief note clarifies that, in Generator Matching (which subsumes a large family of flow matching and diffusion models over continuous, manifold, and discrete spaces), both the Bregman divergence loss and the linear parameterization of the generator can depend on both the current state $X_t$ and the time $t$, and we show that the expectation over time in the loss can be taken with respect to a broad class of time distributions. We also show this for Edit Flows, which falls outside of Generator Matching. That the loss can depend on $t$ clarifies that time-dependent loss weighting schemes, often used in practice to stabilize training, are theoretically justified when the specific flow or diffusion scheme is a special case of Generator Matching (or Edit Flows). It also often simplifies the construction of $X_1$-predictor schemes, which are sometimes preferred for model-related reasons. We show examples that rely upon the dependence of linear parameterizations, and of the Bregman divergence loss, on $t$ and $X_t$.
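
To make the abstract's two claims concrete, here is a minimal NumPy sketch of one familiar special case. It is an illustration under stated assumptions, not the note's own construction: the linear path $X_t = (1-t)X_0 + tX_1$, a squared-error loss (one example of a Bregman divergence), a Beta(2, 2) time distribution, and the helper names `sample_times`, `velocity_loss`, and `x1_loss` are all chosen here for illustration. Under those assumptions, an unweighted velocity loss coincides with an $X_1$-prediction loss weighted by $1/(1-t)^2$, so a time-dependent weight is exactly what links the two parameterizations.

```python
import numpy as np


def sample_times(rng, n, a=2.0, b=2.0, eps=1e-3):
    """Draw t from Beta(a, b) (an assumed, non-uniform time distribution),
    clipped away from 0 and 1 so that 1 / (1 - t) stays finite."""
    t = rng.beta(a, b, size=n)
    return np.clip(t, eps, 1.0 - eps)


def velocity_loss(v_pred, x0, x1, t, weight):
    """Time-weighted squared error on the velocity for the linear path."""
    target = x1 - x0  # conditional velocity of X_t = (1 - t) X_0 + t X_1
    per_sample = np.sum((v_pred - target) ** 2, axis=-1)
    return np.mean(weight(t) * per_sample)


def x1_loss(x1_pred, x1, t, weight):
    """Time-weighted squared error on the X_1 prediction."""
    per_sample = np.sum((x1_pred - x1) ** 2, axis=-1)
    return np.mean(weight(t) * per_sample)


rng = np.random.default_rng(0)
n, d = 512, 4
x0 = rng.standard_normal((n, d))        # source samples
x1 = rng.standard_normal((n, d)) + 3.0  # stand-in "data" samples
t = sample_times(rng, n)
xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1

# Stand-in for a network's X_1 prediction: the truth plus small noise.
x1_pred = x1 + 0.1 * rng.standard_normal((n, d))

# Linear reparameterization that depends on both t and X_t.
v_pred = (x1_pred - xt) / (1.0 - t)[:, None]

loss_v = velocity_loss(v_pred, x0, x1, t, weight=lambda s: np.ones_like(s))
loss_x1 = x1_loss(x1_pred, x1, t, weight=lambda s: 1.0 / (1.0 - s) ** 2)
```

In this sketch `loss_v` and `loss_x1` agree up to floating-point error, because $v - (X_1 - X_0) = (\hat{X}_1 - X_1)/(1-t)$ on the linear path. Swapping in a different `weight` function or a different time distribution in `sample_times` changes how each time is emphasized; the abstract states that such choices are justified whenever the scheme is a special case of Generator Matching (or Edit Flows).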

