SAGE: Semantic-Aware Shared Sampling for Efficient Diffusion
SAGE addresses the sampling-efficiency bottleneck of diffusion models by sharing computation across semantically similar queries, rather than accelerating each query in isolation.
The paper tackles the high sampling cost of diffusion models by proposing SAGE, a semantic-aware shared sampling framework that reduces sampling steps by sharing early-stage sampling across similar queries. Experiments show SAGE reduces sampling cost by 25.5% while improving generation quality with 5.0% lower FID, 5.4% higher CLIP score, and 160% higher diversity.
Diffusion models show clear benefits across diverse domains, yet their high sampling cost, which requires dozens of sequential model evaluations, remains a major limitation. Prior efforts mainly accelerate sampling via optimized solvers or distillation, both of which treat each query independently. In contrast, we reduce the total number of steps by sharing early-stage sampling across semantically similar queries. To enable such efficiency gains without sacrificing quality, we propose SAGE, a semantic-aware shared sampling framework that integrates a shared sampling scheme for efficiency and a tailored training strategy for quality preservation. Extensive experiments show that SAGE reduces sampling cost by 25.5%, while improving generation quality with 5.0% lower FID, 5.4% higher CLIP score, and 160% higher diversity over baselines.
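To make the cost argument concrete, the following is a minimal sketch of how sharing early steps within a group of similar queries lowers the number of model evaluations. It is not the authors' implementation: the greedy grouping rule, the cosine threshold, the names `group_queries` and `sampling_cost`, and the step counts are all assumptions chosen for illustration, and the abstract does not specify how SAGE groups queries or where it branches.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def group_queries(embeddings, threshold):
    """Hypothetical greedy grouping: a query joins the first group whose
    first member is at least `threshold` similar to it, else starts a new group."""
    groups = []
    for idx, emb in enumerate(embeddings):
        for group in groups:
            if cosine(embeddings[group[0]], emb) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


def sampling_cost(groups, total_steps, shared_steps):
    """Return (independent, shared) model-evaluation counts.

    Independent sampling runs `total_steps` for every query. Shared sampling
    runs the first `shared_steps` once per group, then the remaining steps
    once per query in that group.
    """
    independent = sum(len(g) for g in groups) * total_steps
    shared = sum(
        shared_steps + len(g) * (total_steps - shared_steps) for g in groups
    )
    return independent, shared


# Toy 2-D "embeddings" standing in for prompt embeddings (illustrative only).
embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
groups = group_queries(embeddings, threshold=0.9)
independent, shared = sampling_cost(groups, total_steps=50, shared_steps=15)
print(groups, independent, shared)
```

With these toy inputs, the four queries fall into two groups of two, and the shared scheme needs 170 evaluations instead of 200. The saving grows with group size and with the fraction of steps that can be shared; the step counts here are placeholders and are not the settings behind the 25.5% figure reported above.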