LG · Mar 2

Spectral Regularization for Diffusion Models

Satish Chandran, Nicolas Roque dos Santos, Yunshu Wu, Greg Ver Steeg, Evangelos Papalexakis

arXiv:2603.02447v12.71 citations · h-index: 6

Originality: Incremental advance

AI Analysis

This work targets sample quality in diffusion models for image and audio generation. It is an incremental advance: it adds regularization terms to the training loss without changing the core method.

The authors observe that standard diffusion training objectives are agnostic to the spectral and multi-scale structure of natural signals. They propose a spectral regularization framework that augments standard training with Fourier- and wavelet-domain losses, and they report consistent improvements in sample quality, with the largest gains on higher-resolution datasets.

Diffusion models are typically trained using pointwise reconstruction objectives that are agnostic to the spectral and multi-scale structure of natural signals. We propose a loss-level spectral regularization framework that augments standard diffusion training with differentiable Fourier- and wavelet-domain losses, without modifying the diffusion process, model architecture, or sampling procedure. The proposed regularizers act as soft inductive biases that encourage appropriate frequency balance and coherent multi-scale structure in generated samples. Our approach is compatible with DDPM, DDIM, and EDM formulations and introduces negligible computational overhead. Experiments on image and audio generation demonstrate consistent improvements in sample quality, with the largest gains observed on higher-resolution, unconditional datasets where fine-scale structure is most challenging to model.
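
To make the idea of a loss-level spectral regularizer concrete, the following is a minimal sketch, not the authors' implementation. It assumes 1D signals (for example, short audio frames), a log-amplitude Fourier penalty, a Haar wavelet detail penalty, and hypothetical weights `lam_fourier` and `lam_wavelet`; none of these specifics come from the abstract. It uses a naive DFT in plain Python for readability, whereas a real training loop would use a differentiable FFT from an autodiff framework.

```python
import cmath
import math


def mse(pred, target):
    """Pointwise mean squared error, the standard reconstruction term."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def dft_magnitudes(signal):
    """Return |X_k| for k = 0..N-1 using a naive O(N^2) DFT."""
    n = len(signal)
    mags = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def fourier_loss(pred, target, eps=1e-8):
    """Mean squared difference of log-amplitude spectra (an assumed form)."""
    mp = dft_magnitudes(pred)
    mt = dft_magnitudes(target)
    diffs = [(math.log(a + eps) - math.log(b + eps)) ** 2 for a, b in zip(mp, mt)]
    return sum(diffs) / len(diffs)


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def wavelet_loss(pred, target, levels=2):
    """Sum over levels of the MSE between Haar detail coefficients.

    Assumes len(pred) == len(target) and that the length is divisible
    by 2 ** levels.
    """
    total = 0.0
    for _ in range(levels):
        pred, detail_pred = haar_step(pred)
        target, detail_target = haar_step(target)
        total += mse(detail_pred, detail_target)
    return total


def regularized_loss(pred, target, lam_fourier=0.1, lam_wavelet=0.1):
    """Pointwise loss plus weighted spectral penalties (weights are hypothetical)."""
    return (
        mse(pred, target)
        + lam_fourier * fourier_loss(pred, target)
        + lam_wavelet * wavelet_loss(pred, target)
    )


if __name__ == "__main__":
    target = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    pred = [0.1, 0.8, 0.0, -0.9, 0.1, 0.9, -0.1, -1.0]
    print(regularized_loss(pred, target))
```

In this sketch the penalties are added to the pointwise term and nothing else changes, which mirrors the abstract's claim that the diffusion process, architecture, and sampler are left untouched. Which tensor the penalties are applied to (for example, a predicted clean signal versus the noise prediction) is not stated in the abstract and is left open here.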
