CV · AI · Nov 21, 2025

Spanning Tree Autoregressive Visual Generation

arXiv:2511.17089v1
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in autoregressive image generation and editing: it gives researchers and practitioners a method that keeps sampling performance while allowing flexible sequence orders.

The paper tackles the problem of maintaining sampling performance while allowing flexible sequence orders for image editing in autoregressive visual generation. It introduces Spanning Tree Autoregressive (STAR) modeling, which draws sequence orders from traversals of uniform spanning trees and requires no significant architectural changes.

We present Spanning Tree Autoregressive (STAR) modeling, which can incorporate prior knowledge of images, such as center bias and locality, to maintain sampling performance while also providing sufficiently flexible sequence orders to accommodate image editing at inference. Approaches that expose randomly permuted sequence orders to conventional autoregressive (AR) models in visual generation for bidirectional context either suffer from a decline in performance or compromise the flexibility in sequence order choice at inference. Instead, STAR utilizes traversal orders of uniform spanning trees sampled in a lattice defined by the positions of image patches. Traversal orders are obtained through breadth-first search, which allows us to use rejection sampling to efficiently construct a spanning tree whose traversal order places the connected partial observation of the image as a prefix of the sequence. Through a randomized strategy that is tailored yet more structured than random permutation, STAR preserves the capability of postfix completion while maintaining sampling performance, without any significant changes to the model architecture widely adopted in language AR modeling.
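The ordering procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Wilson's algorithm as the uniform spanning tree sampler (the abstract does not name one), a root placed at the grid center when nothing is observed, and a random visiting order among siblings. All function names are hypothetical.

```python
import random
from collections import deque


def neighbors(cell, h, w):
    """Yield the 4-connected lattice neighbors of a patch position."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < h and 0 <= nc < w:
            yield (nr, nc)


def sample_uniform_spanning_tree(h, w, root, rng):
    """Wilson's algorithm (an assumed choice of sampler): grow the tree by
    loop-erased random walks. Returns a dict mapping each cell to its parent."""
    parent = {root: None}
    for start in ((r, c) for r in range(h) for c in range(w)):
        if start in parent:
            continue
        # Walk until the tree is hit; overwriting next_step erases loops.
        next_step = {}
        cell = start
        while cell not in parent:
            next_step[cell] = rng.choice(list(neighbors(cell, h, w)))
            cell = next_step[cell]
        # Retrace the loop-erased path and attach it to the tree.
        cell = start
        while cell not in parent:
            parent[cell] = next_step[cell]
            cell = next_step[cell]
    return parent


def bfs_order(parent, root, rng):
    """Breadth-first traversal of the tree, visiting siblings in random order."""
    children = {cell: [] for cell in parent}
    for cell, par in parent.items():
        if par is not None:
            children[par].append(cell)
    order, queue = [], deque([root])
    while queue:
        cell = queue.popleft()
        order.append(cell)
        kids = children[cell]
        rng.shuffle(kids)
        queue.extend(kids)
    return order


def sample_order_with_prefix(h, w, observed, rng, max_tries=10_000):
    """Rejection sampling: resample trees until the observed patches are
    exactly the first len(observed) entries of the BFS order."""
    observed = set(observed)
    # Assumption: root inside the observation, or the grid center if there is none.
    root = rng.choice(sorted(observed)) if observed else (h // 2, w // 2)
    for _ in range(max_tries):
        parent = sample_uniform_spanning_tree(h, w, root, rng)
        order = bfs_order(parent, root, rng)
        if set(order[:len(observed)]) == observed:
            return order
    raise RuntimeError("no accepted order; observation may be disconnected")


rng = random.Random(0)
order = sample_order_with_prefix(4, 4, {(1, 1), (1, 2)}, rng)
print(order[:2], "...", len(order), "patches")
```

In this sketch every patch after the first is lattice-adjacent to an earlier patch in the order, which is one simple way to encode locality. The acceptance rate of the rejection step drops as the observed region grows, so a practical version would likely need a more targeted construction than plain resampling.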

