Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages
This work addresses a specific problem for digital artists by providing a machine-learning tool for flexible image recombination, though it appears incremental as it builds on existing attention and set-input methods.
The paper tackles the generation of image collages from sets of source templates, a task that is difficult for classical convolutional neural models because they require inputs of fixed size while template sets vary in size. It presents an architecture called MAGIC that uses set-structured representations to generate a collage in one forward pass, yielding a GAN-based framework for digital collage creation.
Cutting and pasting image segments feels intuitive: the choice of source templates gives artists flexibility in recombining existing source material. Formally, this process takes an image set as input and outputs a collage of the set elements. Such selection from sets of source templates does not fit easily into classical convolutional neural models, which require inputs of fixed size. Inspired by advances in attention and set-input machine learning, we present a novel architecture that uses set-structured representations to generate image collages of source templates in one forward pass. This paper makes the following contributions: (i) a novel framework for image generation called Memory Attentive Generation of Image Collages (MAGIC), which gives artists new ways to create digital collages; (ii) from the machine-learning perspective, a novel Generative Adversarial Network (GAN) architecture that uses Set-Transformer layers and set-pooling to blend sets of random image samples, a hybrid non-parametric approach.
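The set-pooling idea mentioned in the abstract can be illustrated with a small sketch. The code below is not the MAGIC architecture. It is a minimal, assumed example of attention-based pooling over a set, in which a single query vector (here random, standing in for a learned seed) weights the set elements and returns one fixed-size summary. The function name `attention_pool` and all dimensions are hypothetical and chosen for illustration only.

```python
import math
import random


def attention_pool(items, query):
    """Pool a variable-size set of feature vectors into one vector.

    items: list of equal-length feature vectors, one per set element.
    query: seed vector of the same length (assumed; random in this sketch).
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the query and every set element.
    scores = [sum(q * x for q, x in zip(query, item)) / scale for item in items]
    # Numerically stable softmax over the set dimension.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the elements: output size is independent of set size.
    dim = len(items[0])
    return [sum(w * item[j] for w, item in zip(weights, items)) for j in range(dim)]


random.seed(0)
dim = 4
query = [random.gauss(0, 1) for _ in range(dim)]
small_set = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
large_set = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(7)]

pooled_small = attention_pool(small_set, query)
pooled_large = attention_pool(large_set, query)
pooled_reversed = attention_pool(large_set[::-1], query)
```

Both sets map to a vector of the same length, and reordering the elements changes the result only by floating-point rounding. This is the property that lets a set-input model accept a varying number of source templates, which a fixed-size convolutional input cannot do directly.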