LG | CV | Feb 13, 2025

Variational Rectified Flow Matching

arXiv:2502.09616v1 | 27 citations | h-index: 2 | ICML
Originality: Highly original
AI Analysis

This work addresses flow matching for computer vision and machine learning applications and offers an incremental improvement over classic rectified flow matching.

The authors extend rectified flow matching with a variational approach that models multi-modal velocity vector-fields, and report compelling results on synthetic data, MNIST, CIFAR-10, and ImageNet. The approach is reported to improve performance, although no specific numbers are given.

We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity vector-fields. At inference time, classic rectified flow matching "moves" samples from a source distribution to the target distribution by solving an ordinary differential equation via integration along a velocity vector-field. At training time, the velocity vector-field is learnt by linearly interpolating between randomly coupled samples, one drawn from the source distribution and one drawn from the target distribution. This leads to "ground-truth" velocity vector-fields that point in different directions at the same location, i.e., the velocity vector-fields are multi-modal/ambiguous. However, since training uses a standard mean-squared-error loss, the learnt velocity vector-field averages the "ground-truth" directions and isn't multi-modal. In contrast, variational rectified flow matching learns and samples from multi-modal flow directions. We show on synthetic data, MNIST, CIFAR-10, and ImageNet that variational rectified flow matching leads to compelling results.
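The ambiguity described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 1D two-point source and target distributions, the fixed time t = 0.5, and the function names `interpolate` and `target_velocity` are all assumptions chosen for illustration. It shows that randomly coupled pairs can pass through the same location with opposite "ground-truth" velocities, and that the mean-squared-error minimizer at that location is their average.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Linear interpolation between a source sample x0 and a target sample x1."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Velocity of the straight path from x0 to x1 (constant along the path)."""
    return x1 - x0


# Hypothetical toy setup: source and target are both uniform over {-1, +1},
# and the two samples in each pair are drawn independently (random coupling).
rng = np.random.default_rng(0)
n = 10_000
x0 = rng.choice([-1.0, 1.0], size=n)
x1 = rng.choice([-1.0, 1.0], size=n)

t = 0.5
xt = interpolate(x0, x1, t)
v = target_velocity(x0, x1)

# Pairs (-1, +1) and (+1, -1) both pass through x_t = 0 at t = 0.5,
# but with opposite velocities.
at_origin = np.isclose(xt, 0.0)
velocities_at_origin = v[at_origin]
modes = np.unique(velocities_at_origin)

# A constant prediction that minimizes mean-squared error at x_t = 0 is the
# sample mean of the velocities observed there, which lies between the modes.
mse_optimal = velocities_at_origin.mean()

print("distinct velocities at x_t = 0:", modes)
print("MSE-optimal prediction at x_t = 0:", mse_optimal)
```

In this toy case the two velocity modes are symmetric, so the averaged prediction is close to zero and matches neither "ground-truth" direction, which is the behaviour the variational formulation is meant to avoid.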
