CV · Oct 8, 2025

Dynamic Mixture-of-Experts for Visual Autoregressive Model

arXiv:2510.08629v1 · 1 citation · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses inference efficiency for users of VAR models in image generation, though it appears incremental because it builds on existing VAR and MoE methods.

The paper tackles computational redundancy in Visual Autoregressive Models (VAR) for image generation by introducing a dynamic Mixture-of-Experts router with scale-aware thresholding. The reported result is 20% fewer FLOPs and 11% faster inference while matching the image quality of the dense baseline.

Visual Autoregressive Models (VAR) offer efficient and high-quality image generation but suffer from computational redundancy due to repeated Transformer calls at increasing resolutions. We introduce a dynamic Mixture-of-Experts router integrated into VAR. The new architecture allows compute to be traded for quality through scale-aware thresholding. This thresholding strategy balances expert selection based on token complexity and resolution, without requiring additional training. As a result, we achieve 20% fewer FLOPs and 11% faster inference, and match the image quality achieved by the dense baseline.
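
To make the idea of scale-aware thresholding concrete, here is a minimal NumPy sketch of one way such a router could work. It is not the paper's implementation: the abstract does not give the routing rule, so the cumulative-probability criterion, the linear threshold schedule, and the names `scale_threshold`, `dynamic_route`, `tau_start`, and `tau_end` are all assumptions made for illustration. Each token keeps the fewest experts whose combined router probability reaches a threshold, and the threshold changes with the scale index.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scale_threshold(scale_idx, num_scales, tau_start=0.9, tau_end=0.5):
    """Hypothetical schedule: interpolate the threshold linearly from the
    coarsest scale (tau_start) to the finest scale (tau_end)."""
    if num_scales == 1:
        return tau_start
    frac = scale_idx / (num_scales - 1)
    return tau_start + frac * (tau_end - tau_start)


def dynamic_route(router_logits, tau):
    """Return a boolean (tokens, experts) mask of selected experts.

    Assumed rule: per token, sort experts by router probability and keep
    an expert while the probability mass of the experts ranked before it
    is still below tau. The top expert is always kept when tau > 0.
    """
    probs = softmax(router_logits, axis=-1)
    order = np.argsort(-probs, axis=-1)  # descending by probability
    sorted_probs = np.take_along_axis(probs, order, axis=-1)
    mass_before = np.cumsum(sorted_probs, axis=-1) - sorted_probs
    keep_sorted = mass_before < tau
    mask = np.zeros_like(keep_sorted)
    np.put_along_axis(mask, order, keep_sorted, axis=-1)  # undo the sort
    return mask


# One "easy" token (router is confident) and one "hard" token (uniform scores).
logits = np.array([[10.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
experts_used_high_tau = dynamic_route(logits, tau=0.9).sum(axis=-1)
experts_used_low_tau = dynamic_route(logits, tau=0.5).sum(axis=-1)
```

Under this assumed rule, a confidently routed token activates a single expert at either threshold, while an ambiguous token activates more experts, and lowering the threshold reduces how many it activates. Because the rule only reads existing router scores, it can be applied at inference time without retraining, which is consistent with the training-free claim in the abstract. The direction of the schedule (lower threshold at finer scales) is likewise an assumption.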

