CV · AI · LG · Oct 1, 2025

Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents

arXiv:2510.00658v1 · h-index: 12 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a training-efficiency bottleneck for fast generative models. The improvement is significant, but the advance is incremental because it builds on existing CM frameworks.

The paper tackles the prolonged training times and large batch sizes that Consistency Models (CMs) need to reach competitive sample quality with fast inference. It proposes a new loss function, the manifold feature distance (MFD), which accelerates CM training by orders of magnitude and even outperforms the learned perceptual image patch similarity (LPIPS) metric.

With diffusion and flow matching models achieving state-of-the-art generative performance, the interest of the community has now turned to reducing inference time without sacrificing sample quality. Consistency Models (CMs), which are trained to be consistent on diffusion or probability flow ordinary differential equation (PF-ODE) trajectories, enable one- or two-step flow or diffusion sampling. However, CMs typically require prolonged training with large batch sizes to obtain competitive sample quality. In this paper, we examine the training dynamics of CMs near convergence and discover that CM tangents -- CM output update directions -- are quite oscillatory, in the sense that they move parallel to the data manifold, not towards the manifold. To mitigate oscillatory tangents, we propose a new loss function, called the manifold feature distance (MFD), which provides manifold-aligned tangents that point toward the data manifold. Consequently, our method -- dubbed Align Your Tangent (AYT) -- can accelerate CM training by orders of magnitude and even outperform the learned perceptual image patch similarity metric (LPIPS). Furthermore, we find that our loss enables training with extremely small batch sizes without compromising sample quality. Code: https://github.com/1202kbs/AYT
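The distinction between an update that moves parallel to the data manifold and one that points toward it can be made concrete with a toy geometric sketch. The example below is not the paper's MFD loss or any part of the AYT code. It assumes the unit circle in 2D as a stand-in "data manifold", and the function names `decompose_update` and `alignment` are hypothetical, chosen only for illustration.

```python
import math


def decompose_update(x, d):
    """Split update direction d at point x relative to the unit circle.

    Assumes x is not the origin. Returns (toward, parallel):
    toward   -- signed component of d pointing toward the circle
    parallel -- signed component of d parallel to the circle
    """
    r = math.hypot(x[0], x[1])
    # Unit normal to the circle: from the origin through x.
    n = (x[0] / r, x[1] / r)
    # Unit vector parallel to the circle (normal rotated by 90 degrees).
    t = (-n[1], n[0])
    # Outside the circle, "toward the manifold" is inward (-n); inside, outward (+n).
    sign = -1.0 if r > 1.0 else 1.0
    toward = sign * (d[0] * n[0] + d[1] * n[1])
    parallel = d[0] * t[0] + d[1] * t[1]
    return toward, parallel


def alignment(x, d):
    """Cosine between d and the toward-manifold direction (d must be nonzero)."""
    toward, parallel = decompose_update(x, d)
    return toward / math.hypot(toward, parallel)


x = (2.0, 0.0)                 # a point off the toy manifold
oscillatory = (0.0, 1.0)       # slides alongside the circle
aligned = (-1.0, 0.0)          # heads straight for the circle

print(alignment(x, oscillatory))  # 0.0
print(alignment(x, aligned))      # 1.0
```

In this toy picture, an alignment near 0 corresponds to the oscillatory behaviour the abstract describes, while an alignment near 1 corresponds to a manifold-aligned tangent. How the paper measures or enforces this on real image data is not specified in the text above.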

