CV, Feb 26

ToProVAR: Efficient Visual Autoregressive Modeling via Tri-Dimensional Entropy-Aware Semantic Analysis and Sparsity Optimization

arXiv:2602.22948v2, h-index: 3
AI Analysis

This work speeds up inference for Visual Autoregressive (VAR) models, which are used for high-quality image generation, so users and developers get images faster.

This paper addresses the efficiency bottleneck in the later stages of Visual Autoregressive (VAR) models by proposing ToProVAR, an optimization framework that uses attention entropy to analyze semantic projections and identify sparsity patterns across the token, layer, and scale dimensions. ToProVAR achieves up to 3.4x acceleration on the Infinity-2B and Infinity-8B models with minimal quality loss, outperforming prior methods.

Visual Autoregressive (VAR) models enhance generation quality but face a critical efficiency bottleneck in later stages. In this paper, we present a novel optimization framework for VAR models that fundamentally differs from prior approaches such as FastVAR and SkipVAR. Instead of relying on heuristic skipping strategies, our method leverages attention entropy to characterize the semantic projections across different dimensions of the model architecture. This enables precise identification of parameter dynamics under varying token granularity levels, semantic scopes, and generation scales. Building on this analysis, we further uncover sparsity patterns along three critical dimensions (token, layer, and scale) and propose a set of fine-grained optimization strategies tailored to these patterns. Extensive evaluation demonstrates that our approach achieves aggressive acceleration of the generation process while largely preserving semantic fidelity and fine details, outperforming prior methods in both efficiency and quality. Experiments on the Infinity-2B and Infinity-8B models demonstrate that ToProVAR achieves up to 3.4x acceleration with minimal quality loss, effectively mitigating the issues found in prior work. Our code will be made publicly available.
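
The abstract does not give a formula for attention entropy. As a rough illustration only, assuming the usual Shannon entropy taken over each query token's softmax attention weights, the quantity could be computed as in the sketch below. The function names `softmax` and `attention_entropy`, the array shapes, and the `eps` constant are assumptions made for this example and are not taken from the ToProVAR code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_entropy(attn, eps=1e-12):
    """Shannon entropy of each query's attention distribution.

    attn: array of shape (..., num_queries, num_keys) whose rows sum to 1.
    Returns an array of shape (..., num_queries). Low entropy means the
    query attends to a few keys; high entropy means attention is diffuse.
    eps is an assumed small constant that avoids log(0).
    """
    attn = np.asarray(attn, dtype=np.float64)
    return -np.sum(attn * np.log(attn + eps), axis=-1)


# Toy example: random attention logits for 2 heads, 5 queries, 8 keys.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 5, 8))
entropy = attention_entropy(softmax(logits))  # shape (2, 5)

# Sanity checks: uniform attention over 4 keys has entropy log(4),
# and one-hot attention has entropy (approximately) zero.
uniform = attention_entropy(np.full((1, 4), 0.25))
one_hot = attention_entropy(np.array([[1.0, 0.0, 0.0, 0.0]]))
```

How such per-token entropies are aggregated across tokens, layers, and scales to drive pruning or skipping is specific to the paper and is not reproduced here.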
