LG · AI · Apr 16, 2024

SparseDM: Toward Sparse Efficient Diffusion Models

arXiv:2404.10445v4 · 17 citations · h-index: 31 · ICME
AI Analysis

This work addresses the deployment challenges of diffusion models for applications on devices with limited resources, representing an incremental improvement in efficiency.

The paper tackles the problem of inefficient deployment and inference of diffusion models on resource-constrained devices by introducing a method that adds sparse masks to pre-trained models, reducing MACs by 50% while maintaining FID and achieving a 1.2x GPU acceleration.

Diffusion models represent a powerful family of generative models widely used for image and video generation. However, time-consuming deployment, long inference time, and large memory requirements hinder their application on resource-constrained devices. In this paper, we propose a method based on an improved Straight-Through Estimator to improve the deployment efficiency of diffusion models. Specifically, we add sparse masks to the Convolution and Linear layers in a pre-trained diffusion model, then use transfer learning to train the sparse model during the fine-tuning stage and turn on the sparse masks during inference. Experimental results on Transformer- and UNet-based diffusion models demonstrate that our method reduces MACs by 50% while maintaining FID. Sparse models are accelerated by approximately 1.2x on the GPU. Under other MACs conditions, the FID is also lower than 1 compared to other methods.
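
The masking idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the 2:4 (N:M) pattern, the magnitude-based choice of which weights to keep, and the helper names `nm_mask` and `ste_step` are all assumptions made for illustration, and the update uses a plain straight-through estimator rather than the improved variant the abstract refers to. The sketch shows a single linear neuron whose forward pass uses only the masked weights while the gradient is passed straight through to every dense weight.

```python
def nm_mask(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in each consecutive group of m.

    The N:M pattern is an assumption; the abstract only says "sparse masks".
    """
    mask = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest |w| within this group.
        keep = sorted(range(len(group)), key=lambda i: abs(group[i]), reverse=True)[:n]
        mask.extend(1.0 if i in keep else 0.0 for i in range(len(group)))
    return mask


def ste_step(weights, x, target, lr=0.1, n=2, m=4):
    """One SGD step on a linear neuron using a plain straight-through estimator."""
    mask = nm_mask(weights, n, m)
    # Forward pass: only the weights kept by the mask contribute.
    y = sum(w * k * xi for w, k, xi in zip(weights, mask, x))
    err = y - target  # derivative of 0.5 * (y - target)**2 with respect to y
    # Backward pass: the mask is treated as identity, so every dense weight
    # (including the currently pruned ones) receives the gradient err * x_i.
    new_weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return new_weights, mask, y


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
x = [1.0, 2.0, -1.0, 0.5, 1.0, 1.0, 1.0, 1.0]
new_weights, mask, y = ste_step(weights, x, target=1.0)

dense_macs = len(weights)     # one multiply-accumulate per dense weight
sparse_macs = int(sum(mask))  # multiply-accumulates left once the mask is on
print(mask, sparse_macs, dense_macs)
```

With a 2:4 mask, half of the multiply-accumulates in the layer are skipped, which matches the 50% MACs figure in spirit. Because the pruned weights keep receiving gradient, the mask can change between fine-tuning steps instead of being fixed once.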
