CV · AI · Mar 18, 2025

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

arXiv:2503.14487v1 · 17 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses scalability and efficiency issues in diffusion models for image generation. It is broadly applicable, but incremental, because it builds on existing MoE approaches.

The paper tackles the limitation of uniform processing in diffusion models by proposing DiffMoE, which uses a batch-level global token pool and a capacity predictor for dynamic resource allocation. It reports state-of-the-art performance among diffusion models on ImageNet, outperforming dense architectures that use 3x the activated parameters while itself keeping 1x activated parameters.

Diffusion models have demonstrated remarkable success in various image generation tasks, but their performance is often limited by the uniform processing of inputs across varying conditions and noise levels. To address this limitation, we propose a novel approach that leverages the inherent heterogeneity of the diffusion process. Our method, DiffMoE, introduces a batch-level global token pool that enables experts to access global token distributions during training, promoting specialized expert behavior. To unleash the full potential of the diffusion process, DiffMoE incorporates a capacity predictor that dynamically allocates computational resources based on noise levels and sample complexity. Through comprehensive evaluation, DiffMoE achieves state-of-the-art performance among diffusion models on the ImageNet benchmark, substantially outperforming both dense architectures with 3x activated parameters and existing MoE approaches while maintaining 1x activated parameters. The effectiveness of our approach extends beyond class-conditional generation to more challenging tasks such as text-to-image generation, demonstrating its broad applicability across different diffusion model applications. Project Page: https://shiml20.github.io/DiffMoE/
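
To make the idea of a batch-level global token pool concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes an expert-choice style of routing in which each expert picks its top-scoring tokens from one pool spanning every sample in the batch. The function names (`global_pool_route`, `per_sample_load`), the hand-written router scores, and the fixed `capacity` are all hypothetical and chosen only for illustration. The capacity predictor described in the abstract is not modeled.

```python
def global_pool_route(scores, capacity):
    """Hypothetical helper: expert-choice routing over a batch-level pool.

    scores[b][t][e] is the router score of token t in sample b for expert e.
    Returns a list where entry e holds the (b, t) pairs chosen by expert e.
    """
    # Flatten every token of every sample into one shared pool.
    pool = [(b, t) for b, sample in enumerate(scores) for t in range(len(sample))]
    num_experts = len(scores[0][0])
    assignments = []
    for e in range(num_experts):
        # Each expert ranks the whole pool, not each sample separately.
        ranked = sorted(pool, key=lambda bt: scores[bt[0]][bt[1]][e], reverse=True)
        assignments.append(ranked[:capacity])
    return assignments


def per_sample_load(assignments, batch_size):
    """Count how many expert slots each sample received."""
    load = [0] * batch_size
    for chosen in assignments:
        for b, _ in chosen:
            load[b] += 1
    return load


# Toy batch: 2 samples x 4 tokens x 2 experts (scores are made up).
scores = [
    [[0.9, 0.8], [0.7, 0.6], [0.5, 0.9], [0.8, 0.7]],   # sample 0: high scores
    [[0.1, 0.2], [0.6, 0.1], [0.2, 0.3], [0.3, 0.65]],  # sample 1: low scores
]
# 8 tokens in the pool, 2 experts with 4 slots each: 8 slots in total,
# so on average one expert slot per token.
assignments = global_pool_route(scores, capacity=4)
load = per_sample_load(assignments, batch_size=2)
print(load)  # -> [6, 2]
```

In this toy batch the total number of expert slots equals the number of tokens, yet sample 0 receives six slots and sample 1 only two. Per-sample routing would force an even four-and-four split; pooling across the batch is what allows compute to shift toward some samples and away from others.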

