Optimal Stochastic Trace Estimation in Generative Modeling
This work addresses a bottleneck in generative modeling that affects both researchers and practitioners: it improves the efficiency and quality of training diffusion models. The contribution is incremental, since it builds on existing trace estimation methods.
The paper tackles the high variance and scalability issues of Hutchinson estimators in training divergence-based likelihoods for diffusion models by investigating Hutch++, an optimal stochastic trace estimator that minimizes training variance while maintaining transport optimality. The results show that Hutch++ leads to higher-quality generations and effective variance reduction in applications such as simulations, conditional time series forecasts, and image generation.
Hutchinson estimators are widely employed in training divergence-based likelihoods for diffusion models to ensure optimal transport (OT) properties. However, these estimators often suffer from high variance and scalability concerns. To address these challenges, we investigate Hutch++, an optimal stochastic trace estimator for generative models, designed to minimize training variance while maintaining transport optimality. Hutch++ is particularly effective for handling ill-conditioned matrices with large condition numbers, which commonly arise when high-dimensional data exhibits a low-dimensional structure. To mitigate the need for frequent and costly QR decompositions, we propose practical schemes that balance frequency and accuracy, backed by theoretical guarantees. Our analysis demonstrates that Hutch++ yields higher-quality generations. Furthermore, the method exhibits effective variance reduction in various applications, including simulations, conditional time series forecasts, and image generation.
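To make the contrast between the two estimators concrete, the following is a minimal NumPy sketch of the plain Hutchinson estimator and of Hutch++ in its commonly stated form, where the matrix-vector budget is split into thirds. It is an illustration under stated assumptions, not the training code of this work: the function names `hutchinson` and `hutchpp`, the `matvec` callback, and the explicit test matrix are all hypothetical. In a diffusion-model setting the matrix would typically be a Jacobian accessed only through automatic-differentiation products, and the QR step would be the part that the proposed schemes amortize.

```python
import numpy as np

def hutchinson(matvec, d, m, rng):
    """Plain Hutchinson estimate of tr(A) from m matrix-vector products."""
    G = rng.choice([-1.0, 1.0], size=(d, m))      # Rademacher probe vectors
    return np.sum(G * matvec(G)) / m              # mean of g^T A g over probes

def hutchpp(matvec, d, m, rng):
    """Hutch++ estimate of tr(A); uses roughly m matrix-vector products."""
    k = m // 3                                    # budget split into thirds
    S = rng.choice([-1.0, 1.0], size=(d, k))      # sketching probes
    G = rng.choice([-1.0, 1.0], size=(d, k))      # residual probes
    Q, _ = np.linalg.qr(matvec(S))                # orthonormal basis for range(A S)
    exact_part = np.sum(Q * matvec(Q))            # tr(Q^T A Q), computed exactly
    G_perp = G - Q @ (Q.T @ G)                    # project probes off the basis
    residual_part = np.sum(G_perp * matvec(G_perp)) / k  # Hutchinson on the remainder
    return exact_part + residual_part

# Hypothetical test matrix with a fast-decaying spectrum (large condition number).
rng = np.random.default_rng(0)
d = 200
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = 1.0 / np.arange(1, d + 1) ** 2
A = (U * eigs) @ U.T                              # symmetric, eigenvalues 1/i^2

def matvec(X):
    return A @ X

print("true trace :", np.trace(A))
print("Hutchinson :", hutchinson(matvec, d, 30, rng))
print("Hutch++    :", hutchpp(matvec, d, 30, rng))
```

The design point is that the dominant directions of the matrix are captured by `Q` and their contribution to the trace is computed exactly, so the random probes only have to estimate the trace of the small remainder. When the matrix has rank at most `m // 3`, the remainder vanishes and the estimate is exact up to floating-point error.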