CV | LG | Mar 14, 2025

Understanding Flatness in Generative Models: Its Role and Benefits

arXiv:2503.11078v23 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving robustness and generalization in generative models, a concern for both researchers and practitioners. It is an incremental advance: it extends flat-minima concepts from supervised learning to generative learning.

The paper investigates the role of loss surface flatness in generative models, particularly diffusion models, showing that flatter minima improve robustness against perturbations and reduce exposure bias, with experiments on datasets like CIFAR-10 and FFHQ demonstrating enhanced generative performance and resilience to quantization.

Flat minima, known to enhance generalization and robustness in supervised learning, remain largely unexplored in generative models. In this work, we systematically investigate the role of loss surface flatness in generative models, both theoretically and empirically, with a particular focus on diffusion models. We establish a theoretical claim that flatter minima improve robustness against perturbations in target prior distributions. This leads to benefits such as reduced exposure bias (where errors in noise estimation accumulate over iterations) and significantly improved resilience to model quantization, preserving generative performance even under strong quantization constraints. We further observe that Sharpness-Aware Minimization (SAM), which explicitly controls the degree of flatness, effectively enhances flatness in diffusion models, whereas methods that promote flatness only indirectly are less effective: Input Perturbation (IP), which enforces the Lipschitz condition, and ensembling-based approaches such as Stochastic Weight Averaging (SWA) and Exponential Moving Average (EMA). Through extensive experiments on CIFAR-10, LSUN Tower, and FFHQ, we demonstrate that flat minima in diffusion models indeed improve not only generative performance but also robustness.
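
To make the SAM update mentioned in the abstract concrete, here is a minimal sketch of its usual two-step form: an ascent step to an approximate worst-case point within a small L2 ball, then a descent step using the gradient taken there. It is not the paper's training code. The toy quadratic loss, the matrix `A`, the function name `sam_step`, and the values of `lr` and `rho` are all assumptions chosen for illustration; in a diffusion model the loss would be the denoising objective and the gradients would come from backpropagation.

```python
import numpy as np

# Hypothetical toy problem: a quadratic loss with one sharp direction
# (curvature 10) and one flat direction (curvature 0.1).
A = np.diag([10.0, 0.1])

def loss(w):
    """Toy loss standing in for a model's training objective."""
    return 0.5 * float(w @ A @ w)

def grad(w):
    """Analytic gradient of the toy loss."""
    return A @ w

def sam_step(w, lr=0.05, rho=0.05):
    """One SAM-style update (illustrative values for lr and rho)."""
    g = grad(w)
    # Ascent step: first-order approximation of the worst-case
    # perturbation inside an L2 ball of radius rho around w.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: the gradient is evaluated at the perturbed weights
    # but applied to the original weights.
    g_perturbed = grad(w + eps)
    return w - lr * g_perturbed

w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(100):
    w = sam_step(w)

print("initial loss:", loss(w0))
print("final loss:  ", loss(w))
```

The radius `rho` is what "explicitly controls the degree of flatness" refers to in this sketch: a larger radius penalizes sharp directions more strongly, at the cost of a second gradient evaluation per step.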
