CV · AI · May 8

CASCADE: Context-Aware Relaxation for Speculative Image Decoding

arXiv:2605.07230 · 74.8
Predicted impact: top 48% in CV · last 90 days
Originality: Incremental advance
AI Analysis

This work provides a practical acceleration technique for autoregressive image generation, which remains slow even on advanced hardware and is therefore a bottleneck for deployment.

CASCADE addresses the high draft token rejection rate in speculative decoding for autoregressive image generation by exploiting redundancies in the target model's hidden states, achieving up to 3.6x speedup without sacrificing image quality.

Autoregressive generation is a powerful approach for high-fidelity image synthesis, but it remains computationally demanding and slow even on the most advanced accelerators. While speculative decoding has been explored to mitigate this bottleneck, existing approaches fail to achieve efficiency gains comparable to those observed in text generation. A key limitation is the target model's high uncertainty during image generation, which leads to high draft token rejection rates. In this work, we identify previously overlooked patterns in the target model's behavior that emerge naturally in tree-based speculative decoding. Specifically, we formalize two properties, semantic interchangeability and convergence, arising from the redundancies in the target model's hidden state representations. By capturing these redundancies across the depth and breadth of the predicted token tree, our method identifies principled opportunities for acceptance relaxation without requiring additional training. Additionally, we enhance standalone drafter performance by injecting the redundancy signals from the target model into drafter training with minimal modification. We evaluate our approach across multiple text-to-image models and drafter architectures. Results show that CASCADE achieves state-of-the-art speedups for drafter-based speculative decoding, with up to 3.6x acceleration, while maintaining image quality and text-prompt fidelity.
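To make the acceptance step concrete, here is a minimal Python sketch of the standard speculative-sampling test (accept a drafted token with probability min(1, p/q)), plus a purely hypothetical relaxation that pools target probability over a group of tokens assumed to be interchangeable. The function names, the `interchangeable` argument, and the pooling rule are illustrative assumptions, not the method described in the abstract, which derives its relaxation from hidden-state redundancies. A relaxation of this kind generally gives up exact matching of the target distribution in exchange for a higher acceptance rate.

```python
import random


def accept_draft_token(p_target, q_draft, token, interchangeable=None, rng=random):
    """Return True if a drafted token passes the acceptance test.

    p_target, q_draft: dicts mapping token id -> probability.
    interchangeable: optional set of token ids ASSUMED (hypothetically) to be
        interchangeable with `token`; their target probability is pooled.
    """
    p = p_target.get(token, 0.0)
    if interchangeable:
        # Hypothetical relaxation: pool target mass over the whole group.
        p = sum(p_target.get(t, 0.0) for t in set(interchangeable) | {token})
    q = q_draft.get(token, 0.0)
    if q <= 0.0:
        return False
    # Standard rule: accept with probability min(1, p / q).
    return rng.random() < min(1.0, p / q)


def acceptance_rate(p_target, q_draft, interchangeable_of=None):
    """Expected acceptance probability when tokens are drawn from q_draft."""
    total = 0.0
    for token, q in q_draft.items():
        p = p_target.get(token, 0.0)
        if interchangeable_of:
            group = set(interchangeable_of.get(token, ())) | {token}
            p = sum(p_target.get(t, 0.0) for t in group)
        # q * min(1, p / q) simplifies to min(q, p).
        total += min(q, p)
    return total


# Toy example: an uncertain (flat) target and an overconfident drafter.
p_target = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
q_draft = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}
strict = acceptance_rate(p_target, q_draft)
relaxed = acceptance_rate(p_target, q_draft, {0: {1}, 1: {0}})
print(f"strict={strict:.2f} relaxed={relaxed:.2f}")
```

In this toy setting the flat target distribution stands in for the high uncertainty the abstract describes: the strict test rejects the drafter's favourite token most of the time, while treating tokens 0 and 1 as one group raises the expected acceptance rate.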

