CV · Jul 20, 2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

Microsoft
arXiv:2207.09814v2 · 99 citations · h-index: 74 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of infinite visual synthesis for applications requiring flexible image and video generation, representing an incremental improvement over existing models.

The paper tackles the problem of generating arbitrarily-sized high-resolution images or long-duration videos by proposing NUWA-Infinity, which uses an autoregressive over autoregressive mechanism with a Nearby Context Pool and an Arbitrary Direction Controller. Unlike DALL-E, Imagen, and Parti, it generates images of arbitrary size and also supports long-duration video; compared to NUWA, it improves resolution and variable-size generation.

In this paper, we present NUWA-Infinity, a generative model for infinite visual synthesis, which is defined as the task of generating arbitrarily-sized high-resolution images or long-duration videos. An autoregressive over autoregressive generation mechanism is proposed to deal with this variable-size generation task, where a global patch-level autoregressive model considers the dependencies between patches, and a local token-level autoregressive model considers dependencies between visual tokens within each patch. A Nearby Context Pool (NCP) is introduced to cache related patches that have already been generated as the context for the current patch being generated, which can significantly reduce computation costs without sacrificing patch-level dependency modeling. An Arbitrary Direction Controller (ADC) is used to decide suitable generation orders for different visual synthesis tasks and to learn order-aware positional embeddings. Compared to DALL-E, Imagen and Parti, NUWA-Infinity can generate high-resolution images of arbitrary size and additionally supports long-duration video generation. Compared to NUWA, which also covers images and videos, NUWA-Infinity has superior visual synthesis capabilities in terms of resolution and variable-size generation. The GitHub link is https://github.com/microsoft/NUWA. The homepage link is https://nuwa-infinity.microsoft.com.
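The nested structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`raster_order`, `nearby_context`, `sample_token`, `generate`, `extent`) is hypothetical, the token sampler is a random placeholder rather than a learned model, a fixed raster order stands in for whatever order the ADC would choose, and the eviction rule is only one plausible way a context pool could stay bounded.

```python
import random
from typing import Dict, List, Tuple

Coord = Tuple[int, int]   # (row, col) of a patch in the output grid
Patch = List[int]         # visual-token ids of one patch


def raster_order(rows: int, cols: int) -> List[Coord]:
    """Assumed generation order: left to right, top to bottom."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def nearby_context(pool: Dict[Coord, Patch], pos: Coord, extent: int = 1) -> List[Patch]:
    """Return cached patches at most `extent` grid steps away from `pos`."""
    r, c = pos
    keys = sorted(k for k in pool
                  if abs(k[0] - r) <= extent and abs(k[1] - c) <= extent)
    return [pool[k] for k in keys]


def sample_token(context: List[Patch], prefix: Patch,
                 vocab_size: int, rng: random.Random) -> int:
    """Placeholder for a token-level model: a uniform draw that ignores its inputs."""
    return rng.randrange(vocab_size)


def generate(rows: int, cols: int, tokens_per_patch: int = 4,
             vocab_size: int = 16, extent: int = 1, seed: int = 0):
    rng = random.Random(seed)
    pool: Dict[Coord, Patch] = {}     # cache of recently generated patches
    output: Dict[Coord, Patch] = {}   # every generated patch
    peak = 0                          # largest pool size observed
    for pos in raster_order(rows, cols):           # global, patch-level loop
        context = nearby_context(pool, pos, extent)
        patch: Patch = []
        for _ in range(tokens_per_patch):          # local, token-level loop
            patch.append(sample_token(context, patch, vocab_size, rng))
        output[pos] = patch
        pool[pos] = patch
        # Under raster order, rows more than `extent` above the current row
        # can never be near a future patch, so they are dropped from the pool.
        for key in [k for k in pool if k[0] < pos[0] - extent]:
            del pool[key]
        peak = max(peak, len(pool))
    return output, peak
```

Calling `generate(3, 4)` yields 12 patches of 4 tokens each. With this eviction rule the pool never holds more than `(extent + 1) * cols` patches, so the context cost per patch stays fixed however many rows are generated, which is the kind of saving the abstract attributes to caching nearby patches.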

Code Implementations: 1 repo
