CV | Mar 30, 2025

FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning

arXiv:2503.23367v3 | 28 citations | h-index: 14 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in VAR models for high-resolution image generation, offering a practical solution for faster inference with minimal accuracy loss.

The paper tackles the high computational cost of Visual Autoregressive (VAR) modeling at large image resolutions by proposing FastVAR, a post-training acceleration method that uses cached token pruning to reduce the number of forwarded tokens, achieving a 2.7x speedup with a performance drop of less than 1%.

Visual Autoregressive (VAR) modeling has gained popularity for its shift towards next-scale prediction. However, existing VAR paradigms process the entire token map at each scale step, so complexity and runtime scale dramatically with image resolution. To address this challenge, we propose FastVAR, a post-training acceleration method for efficient resolution scaling with VARs. Our key finding is that the majority of latency arises from the large-scale step, where most tokens have already converged. Leveraging this observation, we develop a cached token pruning strategy that forwards only pivotal tokens for scale-specific modeling while using cached tokens from previous scale steps to restore the pruned slots. This significantly reduces the number of forwarded tokens and improves efficiency at larger resolutions. Experiments show that FastVAR can further speed up FlashAttention-accelerated VAR by 2.7$\times$ with a negligible performance drop of <1%. We further extend FastVAR to zero-shot generation of higher-resolution images. In particular, FastVAR can generate one 2K image with a 15GB memory footprint in 1.5s on a single NVIDIA 3090 GPU. Code is available at https://github.com/csguoh/FastVAR.
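The cached token pruning idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the code from the FastVAR repository. The function names, the scoring rule (squared distance from the mean token), the nearest-neighbour upsampling of the cache, and the `layer` callable are all assumptions made for illustration; the abstract only states that pivotal tokens are forwarded and that cached tokens from previous scale steps restore the pruned slots.

```python
from typing import Callable, List, Tuple

Token = List[float]


def upsample_nearest(prev: List[List[Token]], h: int, w: int) -> List[List[Token]]:
    """Nearest-neighbour upsample of a cached (hp x wp) token map to (h x w).

    Assumption: the cache is brought to the current scale by simple
    nearest-neighbour indexing; a real model may interpolate differently.
    """
    hp, wp = len(prev), len(prev[0])
    return [[list(prev[i * hp // h][j * wp // w]) for j in range(w)] for i in range(h)]


def pivotal_indices(tokens: List[Token], keep_ratio: float) -> List[int]:
    """Pick the tokens to forward.

    Hypothetical score: squared distance of each token from the mean token,
    so tokens that deviate most from the average are treated as pivotal.
    """
    n, d = len(tokens), len(tokens[0])
    mean = [sum(t[k] for t in tokens) / n for k in range(d)]
    scores = [sum((t[k] - mean[k]) ** 2 for k in range(d)) for t in tokens]
    k = max(1, round(keep_ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def pruned_scale_step(
    tokens: List[Token],
    cached_prev: List[List[Token]],
    h: int,
    w: int,
    keep_ratio: float,
    layer: Callable[[List[Token]], List[Token]],
) -> Tuple[List[Token], List[int]]:
    """Run one scale step on pivotal tokens only.

    tokens:      flat list of h*w input tokens at the current scale
    cached_prev: 2D token map cached from the previous (smaller) scale step
    layer:       stand-in for the network block applied to forwarded tokens
    """
    keep = pivotal_indices(tokens, keep_ratio)
    forwarded = layer([tokens[i] for i in keep])  # only the kept tokens pay compute

    # Restore pruned slots from the upsampled cache, then scatter fresh outputs.
    up = upsample_nearest(cached_prev, h, w)
    out = [up[i // w][i % w] for i in range(h * w)]
    for idx, tok in zip(keep, forwarded):
        out[idx] = tok
    return out, keep
```

With `keep_ratio=0.25` on a 2x2 map, `layer` sees one token instead of four, and the other three slots are filled from the cached previous-scale map. The cost of the step then depends on the number of kept tokens rather than on the full token map.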

Code Implementations: 1 repo