Stabilizing Consistency Training: A Flow Map Analysis and Self-Distillation
This work addresses stability issues in consistency models for generative modeling, offering theoretical insight and a practical remedy that, while incremental, extends the models' applicability.
The paper tackles the instability and limited reproducibility of consistency models trained from scratch. It provides a theoretical flow map analysis that clarifies how training stability and convergence behavior can give rise to degenerate solutions, and it proposes a reformulated self-distillation strategy that avoids excessive gradient norms, stabilizes optimization, and extends to diffusion-based policy learning without pretrained initialization.
Consistency models have been proposed for fast generative modeling, achieving results competitive with diffusion and flow models. However, these methods exhibit inherent instability and limited reproducibility when trained from scratch, motivating subsequent work to explain and mitigate these issues. While these efforts have provided valuable insights, the explanations remain fragmented, and the theoretical relationships among them remain unclear. In this work, we provide a theoretical examination of consistency models by analyzing them from a flow map-based perspective. This joint analysis clarifies how training stability and convergence behavior can give rise to degenerate solutions. Building on these insights, we revisit self-distillation as a practical remedy for certain forms of suboptimal convergence and reformulate it to avoid excessive gradient norms, enabling stable optimization. We further demonstrate that our strategy extends beyond image generation to diffusion-based policy learning, without reliance on a pretrained diffusion model for initialization, thereby illustrating its broader applicability.